Abstract
Cybersecurity is crucial in today’s interconnected world, as digital technologies are increasingly used in various sectors. The risk of cyberattacks targeting financial, military, and political systems has increased due to the wide use of technology. Cybersecurity has become vital in information technology, with data protection being a major priority. Despite government and corporate efforts, cybersecurity remains a significant concern. The application of multi-task learning (MTL) in cybersecurity is a promising solution, allowing security systems to simultaneously address various tasks and adapt in real-time to emerging threats. While researchers have applied MTL techniques for different purposes, a systematic overview of the state-of-the-art on the role of MTL in cybersecurity is lacking. Therefore, we carried out a systematic literature review (SLR) on the use of MTL in cybersecurity applications and explored its potential applications and effectiveness in developing security measures. Five critical applications, such as network intrusion detection and malware detection, were identified, and several tasks used in these applications were observed. Most of the studies used supervised learning algorithms, and there were very limited studies that focused on other types of machine learning. This paper outlines various models utilized in the context of multi-task learning within cybersecurity and presents several challenges in this field.
1 Introduction
In today’s digital age, the significance of cybersecurity cannot be overstated. It serves as a critical defense for digital systems, networks, and data, protecting them against unauthorized access, theft, or corruption. The rapid growth of digital technology has made cybersecurity more crucial than ever, as cyber threats can disrupt organizations in ways ranging from personal data breaches to interference with financial transactions, leading to significant financial losses and reputational damage. Over the last five years, the FBI’s Internet Crime Complaint Center (IC3) has registered an annual average of 652,000 complaints. Since 2018, the total number of complaints has reached 3.26 million, resulting in losses amounting to $27.6 billion [1]. Prioritizing cybersecurity is essential for both individuals and organizations to mitigate these threats.
Cybersecurity covers an extensive variety of procedures, methods, and technologies that work together to defend the availability, confidentiality, and integrity of networks, software, and data against attacks. It involves the development of robust security protocols, sophisticated encryption models, and proactive countermeasures. Cyber defense mechanisms operate across hosts, networks, applications, and data. Multiple solutions are available, working side by side to prevent threats and identify security breaches. These include firewalls, anti-virus solutions, intrusion detection systems (IDSs), and intrusion prevention systems (IPSs) [2]. A race has persisted between cybercriminals and defenders ever since the discovery of the first computer virus in 1970 [3]. The battle against cybersecurity threats and the challenge of keeping pace with their increasing speed have become more demanding over time.
Recently, cybersecurity experts have become more interested in artificial intelligence (AI) because it can effectively analyze and organize considerable amounts of internet traffic data [4]. According to estimates, the global market size for AI in the cybersecurity sector reached USD 14.9 billion in 2021, with a projected market value of USD 133.8 billion by 2030 [5]. AI and machine learning (ML) techniques are being increasingly integrated into the domain of cybersecurity [6,7,8]. ML is a subset of AI that employs computer programs to learn from historical data for modeling, control, or prediction. It includes reinforcement learning, supervised learning, unsupervised learning, and semi-supervised learning [9]. Deep learning (DL) relies on the use of multiple layers (e.g., convolutional layers and batch normalization layers) and has emerged as a critical component in addressing complex cybersecurity challenges [10]. However, DL is well known for its extensive data requirements and computational demands, which can be mitigated through the implementation of multi-task learning (MTL).
MTL refers to an ML training approach in which models are concurrently trained using data from multiple tasks. This is achieved by leveraging shared layers, enabling models to recognize the main correlations across a set of interconnected tasks [11]. MTL initially aimed to address the data sparsity problem by aggregating labeled data from all tasks, reducing manual labeling costs, and reusing existing knowledge. As Big Data emerges in different areas of AI such as computer vision and natural language processing (NLP), deep MTL models can provide higher performance than single-task models. MTL utilizes more data from different tasks, learning more robust representations and stronger models in terms of overfitting risk and performance [12].
MTL offers several advantages over single-task learning (STL), such as leveraging similarities and relationships between tasks, acting as a regularizer, improving generalization, and reducing the risk of overfitting [11, 13]. By jointly learning multiple tasks, the model can take advantage of additional information available in the training data, leading to more robust and accurate predictions. MTL also addresses the challenge of limited data, as the model learns from related tasks, thereby facilitating the transfer of knowledge and enhancing the learning process for individual tasks such as malware detection [14].
In ML/DL, optimizing a single task can lead to reasonable performance, but it can be costly and make edge cases difficult to cover, particularly because training complicated tasks requires significant computational resources. Multi-task learning can help address these issues by providing more diverse data and reducing training time and resources. Using multiple tasks can provide more data in general and increase diversity in data, thus enhancing the overall performance of the system [15].
Several systematic review studies have explored the application of existing classification algorithms to detect cyber threats. For instance, the study [16] conducted a systematic review of AI and ML techniques for cybersecurity, while [17] presented a systematic review of defensive and offensive cybersecurity with ML. A number of articles have also looked into how ML and DL can be used in specific areas of cybersecurity, such as (i) detecting malware on the Internet of Things [18], (ii) detecting malware on Android mobile devices [19], (iii) cloud security [20], and (iv) detecting phishing [21].
However, these systematic reviews only looked at certain areas of cybersecurity that used a single-task learning approach and did not consider the potential advantages of MTL. To address this gap, our study aims to provide an overview of the cybersecurity domain from the perspective of MTL techniques. The primary objective of this paper is to conduct an SLR on MTL in cybersecurity through ten research questions. This review covers a variety of multi-task models for tasks related to cybersecurity, including network-based and host-based intrusion detection, cyber threat detection, cyberbullying detection, malware detection, and critical infrastructure attack detection.
To the best of our knowledge, this is the first SLR that offers a comprehensive overview of MTL’s application across various cybersecurity application domains.
In this study, we investigate the application of multi-task learning techniques in cybersecurity, following a systematic literature review (SLR) to identify and synthesize relevant research methodologies. This SLR provides a comprehensive overview of the state of the art, supporting our investigation into effective multi-task learning strategies for improving prediction models in cybersecurity. The insights gathered from this SLR study are significant for recognizing new trends and establishing more effective cybersecurity strategies in these critical domains.
The following ten research questions were formulated in this research:
- RQ1: What are the potential applications (e.g., detection of malware, detection of network intrusion) of MTL in the cybersecurity domain?
- RQ2: What advantages does MTL offer for cybersecurity?
- RQ3: What type of tasks are used in MTL-based models?
- RQ4: What types of machine learning techniques (e.g., unsupervised) are used for MTL models in cybersecurity?
- RQ5: What are the most frequently used machine learning and deep learning algorithms for MTL in cybersecurity?
- RQ6: Which model provides the best performance for MTL in cybersecurity?
- RQ7: What datasets are used for evaluating multi-task learning models in cybersecurity?
- RQ8: What kind of evaluation approaches and parameters are used?
- RQ9: Which implementation platforms are used in MTL studies?
- RQ10: What are the challenges and possible solutions for multi-task learning in cybersecurity?
Data extraction from the 28 selected papers and the subsequent analysis combine quantitative synthesis and qualitative analysis. We recognize that the methods applied for data extraction and analysis can vary across studies; consequently, we have adopted a detailed approach to address this diversity. Our synthesis includes quantitative techniques to systematically assess and summarize numerical data, while also employing qualitative analysis methods to explore the aspects and outcomes within the literature. Furthermore, our review complies with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) reporting guidelines, ensuring quality in our methodology [22, 23]. For a detailed presentation of our data extraction and analysis procedures, readers are encouraged to refer to the Research Methodology section, where we provide in-depth insight into our approach.
The main contributions of this study are listed as follows:
1. This article presents the first SLR in the literature on the implementation of multi-task learning in cybersecurity and offers insights into existing research, methodologies, and challenges.
2. This research identifies and categorizes five critical applications where MTL is applied, shedding light on specific areas such as network intrusion detection and malware detection. This categorization helps in understanding the diverse applications of MTL in cybersecurity.
3. This SLR not only synthesizes existing knowledge but also evaluates research trends, highlighting the predominant use of supervised learning algorithms and identifying research challenges and potential research areas for future exploration.
The remainder of this article examines the important facets of multi-task learning (MTL) in cybersecurity. Section 2 presents background and related work. In particular, Subsect. 2.1 discusses several cybersecurity issues that are the main focus of our investigation, Subsect. 2.2 explores multi-task learning, explaining its foundational ideas and cybersecurity applications, and Subsect. 2.3 reviews related work. Section 3 describes the research methodology, emphasizing the review protocol. Section 4 presents the detailed outcomes obtained from this SLR study, where Subsect. 4.1 elaborates on the primary papers selected, while Subsect. 4.2 provides the detailed answers to the predefined research questions. Subsection 4.3 presents the threats to validity. Section 5 discusses the conclusion and future work.
2 Background and related work
This section explores the foundational aspects of cybersecurity and multi-task learning, as well as an examination of related work in the field. It begins with a brief overview of cybersecurity fundamentals, followed by an exploration of multi-task learning concepts. Later, it reviews relevant studies that employ multi-task learning techniques, providing insights into their methodologies and findings.
2.1 Cybersecurity problems
To briefly address cybersecurity challenges, we will enumerate the most significant types of cyberattacks. These include phishing attacks, impersonation attacks, malware attacks, denial-of-service (DoS) attacks, and hacking and unauthorized access. Additionally, there are specific threats to social media platforms, such as hashtag hijack attacks and black market tweet detection services. Each of these poses unique threats to digital security and integrity, encompassing various forms of unauthorized access, malicious software, misleading strategies, and social manipulation.
A cyberattack refers to an intentional action aimed at compromising the confidentiality, integrity, or availability of IT infrastructures, including their hardware, software, or electronic data. These attacks involve criminal operations leveraging digital technology, including computers, cellphones, the internet, and other digital devices. It is important to note that such attacks not only compromise the CIA triad but also affect other key security properties such as authenticity and non-repudiation [24].
The Cyber Security Breaches Survey, aligned with the UK’s National Cyber Strategy, informs government policy to enhance cyber resilience among businesses, charities, and educational institutions [25]. It examines their cyber security policies, processes, and responses to various cyberattacks and crimes. The survey reports the most common types of cyberattacks in 2024, which include phishing attacks, impersonation attacks, malware attacks, denial-of-service (DoS) attacks, and hacking and unauthorized access. These attacks vary in their methods and impacts, affecting a significant portion of businesses and charities. Table 1 summarizes these common cyberattacks, detailing their distinct characteristics and impacts.
Cybercrimes cover a wide spectrum of activities, each introducing unique threats to cybersecurity. Network intrusion, for instance, involves unauthorized access to digital networks, which can lead to the loss of valuable resources and put data security at risk. These intrusions usually go through a series of steps, beginning with gathering information and ending with the compromise of data [26].
Malware, a common type of cyber threat, is software that is specifically created to carry out harmful actions on targeted computers, resulting in disruptions. Malware types include viruses, worms, ransomware, spyware, adware, and scareware [27].
Phishing is a sneaky strategy used by attackers to trick users into giving away important information, like personal details, banking credentials, IDs, and passwords. They do this by pretending to be trustworthy websites from reputable organizations. Phishing attacks come in various forms, such as deceptive phishing and technical subterfuge [28].
Spam, known for its unsolicited and unwanted messages sent in bulk, serves a variety of purposes, from advertising products or services to promoting fraudulent schemes. Spam can be spread across several channels including social media platforms, causing inconvenience and potential risks to users [29].
By contrast, cyberbullying entails the dissemination of offensive and discriminatory language on various social media platforms. Cyberbullying has the potential to extend beyond personal harm and cause wider social disruptions, and in some cases, it can even play a role in political violence [30].
Social media platforms also encounter unique cyber threats, such as hashtag hijack attacks on mobile social networks. These attacks can disrupt users’ search for relevant content and potentially result in the spread of irrelevant spam or unrelated topics [31]. In addition, research on black market tweet detection draws attention to the problem of fabricated content engagement, such as likes, retweets, and quotes generated through unnatural activity. This challenges the authenticity of online interactions [32, 33].
Furthermore, critical infrastructure systems, including hospitals, telecommunications, energy, banking, finance, and postal sectors, are prime targets for cyberattacks. The definition of a cyberattack on infrastructure can be ambiguous, resulting in the categorization of four distinct types. These types are based on the means of attack (physical or cyber-physical) and the resulting damage (physical or functional) [34].
Researchers increasingly rely on AI techniques, particularly ML and DL methods, to deal with the rising threat of cybercrimes. Recent progress in cybersecurity research has offered valuable insights into emerging threats and effective mitigation strategies. Several significant studies in malware detection, particularly in the field of Obfuscated Memory Malware (OMM) [35], provide a concise and efficient method for identifying new malware in embedded and IoT devices that have limited resources. They achieve this by utilizing hybrid models to outperform existing detection methods. In the same vein, studies of denial-of-service (DoS) and distributed denial-of-service (DDoS) attacks [36, 37] offer valuable knowledge on mitigating the effects of disruptive attacks on digital systems.
MTL is a powerful method that can address many different types of cybercrimes, such as network intrusion, malware, cyber threats, cyberbullying, and vulnerabilities in critical infrastructure. Section 4 (Results) investigates the utilization of MTL in these domains in depth.
2.2 Multi-task learning
MTL aims to enhance the performance of learning tasks by using information shared between tasks [38]. MTL can learn multiple output targets based on a single input source, a single output target based on multiple input sources, or a combination of these two approaches [39]. To illustrate the concept of MTL in a practical context, consider the scenario of a security analyst tasked with identifying spam and phishing emails. These are related tasks, but with distinct characteristics:
Spam: Unsolicited bulk advertising.
Phishing: Attempts to trick recipients into revealing personal information or clicking malicious links.
In traditional single-task learning, separate models might be trained for each task. However, with multi-task learning (MTL), a single model can be trained to handle both tasks simultaneously. For instance, the model could analyze email content for common indicators of spam (e.g., keywords, sender information) and phishing (e.g., urgency, suspicious attachments), leveraging shared knowledge between the tasks.
By employing MTL, the model can improve its accuracy in identifying both spam and phishing emails while also benefiting from increased efficiency through shared learning. This example illustrates how MTL can be a valuable approach in cybersecurity for addressing related threats more effectively.
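The spam and phishing scenario can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a model from any of the reviewed studies: the three email features, all weight values, and the equal weighting of the two losses are hypothetical. It shows one shared layer feeding two task-specific heads, with a joint loss that sums the per-task binary cross-entropy losses.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def shared_features(x, shared_weights):
    """Shared layer used by both tasks: one tanh unit per weight row."""
    return [math.tanh(dot(row, x)) for row in shared_weights]


def predict(x, shared_weights, spam_head, phishing_head):
    """Return (P(spam), P(phishing)) computed from one shared representation."""
    h = shared_features(x, shared_weights)
    return sigmoid(dot(spam_head, h)), sigmoid(dot(phishing_head, h))


def binary_cross_entropy(label, prob):
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))


def joint_loss(x, y_spam, y_phish, shared_weights, spam_head, phishing_head):
    """Sum of both task losses (equal task weights are an assumption)."""
    p_spam, p_phish = predict(x, shared_weights, spam_head, phishing_head)
    return binary_cross_entropy(y_spam, p_spam) + binary_cross_entropy(y_phish, p_phish)


# Hypothetical features: [keyword score, urgency score, has suspicious attachment]
email = [0.9, 0.2, 0.0]
shared_weights = [[0.5, -0.3, 0.1], [0.2, 0.8, 0.4]]  # illustrative values only
spam_head = [1.0, -0.5]
phishing_head = [-0.2, 1.2]

p_spam, p_phish = predict(email, shared_weights, spam_head, phishing_head)
loss = joint_loss(email, 1, 0, shared_weights, spam_head, phishing_head)
```

During training, gradients of `joint_loss` with respect to `shared_weights` would receive contributions from both tasks, which is where the shared knowledge described above comes from.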
MTL is an approach to ML that leverages information from relevant learning tasks to solve multiple tasks concurrently [40, 41]. By incorporating domain information into the training signals for related tasks, this approach improves generalization by providing an inductive bias [41]. The idea behind this approach is that the knowledge gained from each task can enhance the learning process for other tasks [40]. MTL can be useful when tasks have similarities, but it has also been proven to be advantageous for learning tasks that are not related to each other [42].
MTL is distinct from STL, as shown in Fig. 1. In STL, each task is handled independently and model parameters are learned separately, whereas MTL depends on the interrelation of tasks to learn them all together effectively. Training signals from related tasks can significantly enhance the learning of model parameters for each task [38, 43]. MTL has been shown to improve model performance, especially in scenarios with limited training examples and associated tasks [44, 45].
Fig. 1 Comparison of single- and multi-task learning frameworks [46]
Various MTL scenarios have been applied, such as multi-task unsupervised learning, multi-task active learning and multi-task reinforcement learning [14]. In the context of multi-task supervised learning, each task involves a supervised learning scenario where models map data instances to corresponding labels. Noteworthy MTL models corresponding to each setting are discussed in [14].
Two key factors that have a significant impact on MTL are the relationship between tasks and how tasks are defined. Understanding the relationships between different tasks and shaping the design of MTL models accordingly is crucial for task-relatedness. Tasks can be classified into different categories, such as supervised tasks like classification and regression and unsupervised tasks like clustering [14]. MTL has been applied in various fields, such as computer vision [47], bioinformatics [48], drug discovery [49], health informatics [50], speech recognition [51], natural language processing [52], and web applications [53], leading to improved application performance.
In the case of DL, MTL is commonly implemented using a shared feature extractor and several task-specific layers. The shared feature extractor processes the input data, while the task-specific layers generate predictions for each task [11].
Fig. 2 Two methods for MTL in deep neural networks. (a) Approach with hard parameter sharing. (b) Approach with soft parameter sharing [54]
Current approaches to MTL in DL are commonly categorized into two distinct groups: hard parameter sharing and soft parameter sharing. Hard parameter sharing is the process of sharing model weights across several tasks so that each weight is trained to jointly minimize multiple loss functions, as shown in Fig. 2 [54]. This approach reduces the risk of overfitting by compelling the model to capture a representation that fits all tasks simultaneously. In soft parameter sharing, each task is associated with its own specific model that has distinct weights, as illustrated in Fig. 2. However, the joint objective function incorporates the distance between the model parameters of different tasks. These architectural decisions align with MTL’s mechanisms, emphasizing the importance of simultaneous learning, preventing overfitting, and optimizing representations for multiple tasks [13].
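The soft parameter sharing objective can be illustrated with a short sketch. It assumes a squared Euclidean distance between task weight vectors and a single penalty strength `mu`. Both are illustrative choices, since the text does not fix a particular distance, and the weight and loss values are hypothetical.

```python
def squared_distance(w_a, w_b):
    """Squared Euclidean distance between two weight vectors (assumed metric)."""
    return sum((a - b) ** 2 for a, b in zip(w_a, w_b))


def soft_sharing_objective(task_losses, task_weights, mu):
    """Sum of per-task losses plus mu times all pairwise weight distances."""
    total = sum(task_losses)
    n = len(task_weights)
    for i in range(n):
        for j in range(i + 1, n):
            total += mu * squared_distance(task_weights[i], task_weights[j])
    return total


# Hypothetical weights of two task-specific models and their current losses
w_intrusion = [0.5, 1.0, -0.2]
w_malware = [0.4, 1.1, 0.0]
losses = [0.30, 0.45]

independent = soft_sharing_objective(losses, [w_intrusion, w_malware], mu=0.0)
coupled = soft_sharing_objective(losses, [w_intrusion, w_malware], mu=0.1)
```

With `mu = 0` the tasks are trained independently. Increasing `mu` pulls the task-specific weights toward each other, and in the limit where all tasks hold identical weights the penalty vanishes, which corresponds to the hard parameter sharing case.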
The variety of MTL’s mechanisms further emphasizes its effectiveness. Implicit data augmentation is achieved by concurrently learning multiple tasks, which expands the effective sample size and helps avoid overfitting. Attention focusing enables the model to focus on relevant features, which is crucial in scenarios with noisy or limited data [40]. Eavesdropping facilitates the learning of complex features by learning from other tasks. Representation bias introduces a preference for representations that are favored by multiple tasks, which enhances generalization [55]. MTL acts as a regularizer, reducing the risk of overfitting and minimizing the model’s sensitivity to random noise, thereby enhancing its overall generalization ability. The way these mechanisms work together with the architectural choices made for parameter sharing methods provides a complete picture of how to use MTL in deep neural networks [13].
The standard expression for a conventional MTL algorithm [12, 40, 56, 57] is presented in the following equation:
\[\min_{W} \sum_{m=1}^{M} L\left(X^m, y^m, w^m\right) + \lambda \, \mathrm{Reg}(W) \qquad (1)\]
The input matrix \(X^m \in \mathbb{R}^{N_m \times D}\) and the output vector \(y^m \in \mathbb{R}^{N_m \times 1}\) correspond to the m-th task. The weight vector \(w^m \in \mathbb{R}^{D \times 1}\) denotes the regression parameters for the m-th task, which are used to map \(X^m \rightarrow y^m\). The variables \(N_m\), D, and M represent the numbers of samples, features, and tasks, respectively. The regularizer, denoted Reg(W), is used to include prior knowledge of the data and various hypotheses about the interaction between tasks in order to impose different constraints on the parameter matrix \(W = [w^1, \ldots, w^M]\). The regularization parameter \(\lambda\) controls the balance between the loss function and the regularizer. If \(\lambda\) is set to zero, the resulting solution does not include any assumptions or prior knowledge about the relatedness of tasks. Such a solution may yield adequate results only on the training set. When \(\lambda\) is set too large, a generic solution that meets the task-relatedness assumption could be produced, but it might not work effectively for every prediction task. The regularization parameter and other hyper-parameters are often determined via inner cross-validation on the training samples [58]. Overall, Eq. 1 has two terms, namely the data fidelity term and the regularization term [12, 59].
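To make the two terms of Eq. 1 concrete, the following sketch evaluates the objective for a toy problem. It assumes a squared-error loss as the data fidelity term and a mean-regularized penalty, \(\sum_m \Vert w^m - \bar{w} \Vert^2\), as Reg(W). Both are illustrative choices among many possible ones, and the data values are hypothetical. For convenience the code stores W as a list of task vectors, that is, the transpose of the D × M matrix in the text.

```python
def task_loss(X, y, w):
    """Data fidelity for one task: sum of squared residuals of X w against y."""
    total = 0.0
    for row, target in zip(X, y):
        prediction = sum(x_d * w_d for x_d, w_d in zip(row, w))
        total += (prediction - target) ** 2
    return total


def mean_regularizer(W):
    """Assumed Reg(W): squared distance of each task vector from the task mean."""
    num_tasks = len(W)
    mean_w = [sum(w[d] for w in W) / num_tasks for d in range(len(W[0]))]
    return sum((w_d - m_d) ** 2 for w in W for w_d, m_d in zip(w, mean_w))


def mtl_objective(Xs, ys, W, lam):
    """Value of Eq. 1 for given parameters: fidelity term + lam * Reg(W)."""
    fidelity = sum(task_loss(X, y, w) for X, y, w in zip(Xs, ys, W))
    return fidelity + lam * mean_regularizer(W)


# Toy data: M = 2 tasks, D = 2 features, N_1 = 2 and N_2 = 1 samples
Xs = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]]]
ys = [[1.0, 2.0], [3.0]]
W = [[1.0, 2.0], [1.0, 1.0]]  # one weight vector per task

no_prior = mtl_objective(Xs, ys, W, lam=0.0)    # fidelity term only
with_prior = mtl_objective(Xs, ys, W, lam=2.0)  # adds the task-relatedness penalty
```

Setting `lam` to zero leaves only the fidelity term, matching the case described above in which no task-relatedness assumption is imposed.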
2.3 Related work
Several review papers have contributed to a deep understanding of the application of ML and DL in the field of cybersecurity, covering various aspects and sub-domains. The reviews listed in Table 2 offer a brief overview of the application of ML and DL techniques in various domains of cybersecurity. They offer valuable insights for researchers, experts, and individuals interested in exploring the ever-changing field of cybersecurity. However, our extensive review revealed a notable gap in the existing literature. Although there has been extensive research on ML techniques in the field of cybersecurity, we did not come across any systematic literature review paper that specifically focuses on multi-task learning in this area.
Our review revealed numerous notable papers that explore the use of ML techniques in cybersecurity. However, it is clear that there is a lack of a comprehensive and focused systematic literature review (SLR) dedicated to investigating the application of MTL in this specific domain.
3 Research methodology
In this paper, we systematically reviewed the use of MTL in cybersecurity applications by adopting a methodology based on Kitchenham et al. [64], which is widely recognized for its effectiveness in software engineering research.
3.1 Review protocol
Establishing a defined review protocol was the first phase in our research approach. This step included the formulation of specific research questions that would serve as an outline for our study. The purpose of our research questions was to gain an understanding of the practical applications, advantages, and challenges of MTL in the field of cybersecurity.
3.2 Data sources and search strategy
A search strategy is a methodical process for locating relevant resources, which involves choosing databases, keywords, and search strings. It guarantees thorough inclusion and reduces bias. In order to collect related research papers, we ran an extensive search across multiple academic databases, which included:
- Google Scholar
- Web of Science
- Scopus
- Science Direct
- IEEE Xplore
- ACM Digital Library
- Wiley
The selection of these databases was based on their extensive coverage of academic publications and their direct relevance to the domains of cybersecurity and machine learning.
We developed search strings suited to our research questions to ensure comprehensive coverage of relevant studies. The search strings included terms like "Multi-Task Learning," "cybersecurity," "machine learning," "deep learning," "cyber threats," and "cyber attacks." Boolean operators and wildcards were used to refine the searches, ensuring a broad yet focused retrieval of relevant articles. The search process involved multiple iterations to optimize the search strategy, reducing the likelihood of missing important studies and minimizing irrelevant ones.
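As an illustration of how such search strings can be assembled, the sketch below joins synonyms with OR inside a concept group and joins the groups with AND. The grouping of terms is a hypothetical example built from the terms listed above, not the exact strings used in this review. Wildcards are omitted because their syntax differs between databases.

```python
def build_search_string(concept_groups):
    """Join terms with OR within each group, and the groups with AND."""
    clauses = []
    for terms in concept_groups:
        quoted = ['"' + term + '"' for term in terms]
        clauses.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(clauses)


# Hypothetical grouping of the search terms named in the text
concept_groups = [
    ["multi-task learning"],
    ["cybersecurity", "cyber threats", "cyber attacks"],
    ["machine learning", "deep learning"],
]

query = build_search_string(concept_groups)
print(query)
```

Each database applies its own field tags and operator precedence, so a string built this way would normally be adapted per database before use.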
3.3 Inclusion and exclusion criteria
To manage the large amount of literature, we established specific inclusion and exclusion criteria, as detailed in Table 3. These criteria are predefined rules that determine which studies are suitable for inclusion in the review and which are not. They consider factors such as publication date, study design, study language, outcomes, and their relevance to our research questions. This approach ensures that only the most relevant and high-quality studies are selected for our review.
3.4 Data extraction and synthesis
After selecting the relevant studies, we systematically gathered essential data, including authorship, publication year, type of study, and the information needed to answer our research questions. We then proceeded with data synthesis, which involves integrating and analyzing the data from the selected studies. This process may include statistical techniques or other methods to draw meaningful conclusions, identify patterns, or explore variations in the evidence. The detailed explanation of the data extraction and synthesis process, along with the analysis, is provided in the results section.
3.5 Review process
After implementing the aforementioned selection criteria, a total of 85 publications from various data sources were included for further evaluation. Following the removal of duplicate records, we evaluated the titles and abstracts of the remaining 73 publications. The resulting 35 articles were then carefully examined through a comprehensive reading of their full texts. We also performed a manual search using backward and forward snowballing, which led to the discovery of additional articles. Then, as illustrated in Fig. 3, a total of 28 papers were evaluated for their quality and included in the study.
Fig. 3 The PRISMA flow diagram [23]
3.6 Quality assessment
As the last step of the research methodology, a quality assessment was performed, and the quality criteria specified in Table 4 were applied to the selected papers. A paper that fully satisfies a criterion receives a score of 1. A score of 0.5 indicates that the paper only partially meets the criterion. A score of 0 signifies that the paper does not fulfill the criterion.
The quality evaluation scores of the selected papers are illustrated in Fig. 4. The x-axis of the graph shows the quality scores of the papers, while the y-axis indicates the number of papers at each score. The threshold for inclusion of a paper was set to 4. Most papers scored above 6 points, indicating high quality. No papers scoring below 4 were included in the final analysis.
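The scoring rule can be expressed as a short sketch. The per-criterion values (1, 0.5, 0) and the inclusion threshold of 4 come from the text. The number of criteria (eight here) and the two example papers are hypothetical, since the actual criteria are those listed in Table 4.

```python
ALLOWED_SCORES = {0.0, 0.5, 1.0}  # fully, partially, or not satisfied
INCLUSION_THRESHOLD = 4.0         # threshold stated in the text


def quality_score(criterion_scores):
    """Total quality score of one paper: the sum of its per-criterion scores."""
    for score in criterion_scores:
        if score not in ALLOWED_SCORES:
            raise ValueError(f"invalid criterion score: {score}")
    return sum(criterion_scores)


def passes_quality(criterion_scores, threshold=INCLUSION_THRESHOLD):
    """A paper is kept when its total score is not below the threshold."""
    return quality_score(criterion_scores) >= threshold


# Hypothetical papers scored against eight assumed criteria
strong_paper = [1, 1, 0.5, 1, 0.5, 1, 1, 0.5]
weak_paper = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0]

strong_total = quality_score(strong_paper)
weak_total = quality_score(weak_paper)
```

Under these assumed scores, `strong_paper` would be included and `weak_paper` would be excluded.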
4 Results
In this section, we present an analysis of the findings derived in response to the research questions formulated at the beginning of this study. Before presenting the answers, additional details on the identified articles are provided, including their annual distribution and publication distribution. According to the data shown in Fig. 5, there is an overall upward trend in the number of papers published each year. Although our search covered 2017 to early 2023, the highest volume of papers occurs in 2020, and the number begins to rise again in 2022. This trend suggests a growing interest among cybersecurity experts in the application of multi-task learning for cybercrime detection.
4.1 Selected primary studies
This section presents the primary research papers that have been used to answer our 10 research questions. Table 5 presents a compilation of primary articles that have been carefully chosen, along with their respective publication years. We included this table to enhance the repeatability and transparency of our research, providing an overview of the selected primary papers. It is important to note that papers published after our search period are not analyzed in this research paper.
4.2 Response to research questions
4.2.1 RQ1: What are the potential applications of multi-task learning in the cybersecurity domain?
The selected articles revealed a broad range of potential applications for multi-task learning (Fig. 6). In network intrusion detection [66, 68, 71, 74, 79, 80, 84, 86, 88], the majority of approaches address the classification of network traffic together with the detection of malicious traffic. In malware detection [69, 70, 73, 75, 78], the applications include Internet of Things malware detection and malware classification. Cyber threat detection [31,32,33] extends to the identification of risks on social media platforms and the detection of tweets related to black markets. Cyberbullying detection [72, 77] focuses primarily on the identification of hate speech. Critical infrastructure protection [67, 72, 81,82,83, 85, 87, 89] is the most diverse area: it includes the detection of cyberattacks in smart grids, the defense of critical infrastructure, the protection of Internet of Things-based systems, the identification of electricity fraud in advanced metering infrastructure (AMI), speaker verification, and the detection of spoofing attempts. All of these applications contribute to strengthening critical infrastructure in different ways.
4.2.2 RQ2: What advantages does MTL offer for cybersecurity?
MTL is an adaptable and effective method in the field of cybersecurity, offering a range of advantages that strengthen security systems. Our analysis of the selected studies shows that MTL facilitates the simultaneous detection of multiple attack types and supports more efficient learning by using multiple layers for feature representation [66, 68, 84, 86].
One key benefit of MTL is its ability to handle diverse tasks within cybersecurity, including encrypted traffic classification and network intrusion detection, while accommodating various data types as input [69,70,71]. By integrating multiple tasks into an end-to-end training solution, MTL not only addresses different providers’ requirements but also minimizes computational overhead and redundancy through shared feature learning architectures [73, 74, 76, 79]. This holistic approach streamlines problem resolution, optimizes resource utilization, and enhances feature learning capabilities [67, 81].
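A shared feature learning architecture of this kind is commonly realised through hard parameter sharing, in which one shared layer feeds several task-specific heads and the per-task losses are combined into a weighted sum. The following is a minimal forward-pass sketch under stated assumptions, not a reconstruction of any surveyed model: the layer sizes, the two example tasks, the random weights, and the loss weights are all hypothetical, and no training loop is included.

```python
import math
import random


def dense(x, weights, biases):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    m = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    return -math.log(probs[label])


def init(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
N_FEATURES, N_HIDDEN = 6, 4  # hypothetical sizes

# Shared layer: its parameters are used by every task.
shared_w, shared_b = init(N_HIDDEN, N_FEATURES, rng), [0.0] * N_HIDDEN
# Task heads (hypothetical): binary "malicious?" and 3-class "traffic type".
head_a_w, head_a_b = init(2, N_HIDDEN, rng), [0.0] * 2
head_b_w, head_b_b = init(3, N_HIDDEN, rng), [0.0] * 3


def forward(x):
    """Shared features are computed once and reused by both heads."""
    h = relu(dense(x, shared_w, shared_b))
    return (softmax(dense(h, head_a_w, head_a_b)),
            softmax(dense(h, head_b_w, head_b_b)))


def joint_loss(x, label_a, label_b, weight_a=1.0, weight_b=1.0):
    """Weighted sum of the per-task cross-entropy losses."""
    p_a, p_b = forward(x)
    return (weight_a * cross_entropy(p_a, label_a)
            + weight_b * cross_entropy(p_b, label_b))


x = [0.2, 0.0, 1.0, 0.5, 0.3, 0.7]  # hypothetical flow features
print(joint_loss(x, label_a=1, label_b=2))
```

Because the shared layer is evaluated only once per input, adding a further task costs one extra head rather than a complete second model, which is the source of the reduced overhead discussed above.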
Moreover, the adaptability of MTL is a significant asset in addressing evolving cybersecurity challenges. By automating alarm region determination and quickly adapting to new types of malware without complete retraining, MTL ensures practical applicability to real-world security scenarios [67, 79, 80]. This dynamic adaptability makes MTL an invaluable tool in the arsenal of cybersecurity professionals, offering resilience and flexibility in the face of rapidly changing threat landscapes.
In basic terms, MTL in cybersecurity improves the efficiency and effectiveness of security systems, providing practical benefits like adaptability, efficiency, and robustness. Through the utilization of shared representations and multiple layers for feature learning, MTL has proven to be a highly adaptable and essential approach in tackling the complex obstacles of cybersecurity.
4.2.3 RQ3: What type of tasks are used?
The selected primary papers employ a variety of tasks to enhance security. Multi-task learning is a promising way to support cybersecurity efforts because it makes it possible to address multiple security-related issues at once. Table 6 summarizes these tasks, which can be applied to improve threat detection, allocate resources optimally, enhance model accuracy, manage complicated data, and promote cooperation in the cybersecurity domain. The tasks cover a wide range of domains within cybersecurity, each targeting significant aspects of detecting and protecting against threats.
Network intrusion detection, malware detection, and VPN encapsulation detection seek to identify, classify, and analyze potentially malicious activities or anomalies within network traffic. Tasks like Trojan detection and classification aim to spot and categorize specific types of malicious software. Other tasks, such as traffic type recognition, software application classification, and bandwidth prediction, explore network behaviors and predict network patterns.
Cyber threat detection tasks like text matching, hashtag hijack identification, and cyber intelligence recognition work toward identifying and analyzing threats within textual and social media data. Cyberbullying detection tasks focus on identifying hate speech, offensive language, racism, and sexism within online interactions. Critical infrastructure protection tasks include graph classification, feature extraction, and image and video analysis to protect critical systems from various threats, including fraud, spoofing, and large-scale data flow classification and analysis. In addition, tasks associated with automatic speaker verification and anti-spoofing address concerns regarding voice authentication and protection against spoofing attempts. Each of these tasks reflects a crucial domain within cybersecurity, contributing to the overall objective of identifying and safeguarding against threats to the infrastructure.
4.2.4 RQ4: What type of machine learning techniques are used in MTL?
MTL is a widely used ML approach for cybersecurity tasks. It can be built based on the following ML types:
- Supervised learning: Employs labeled datasets to train models for various tasks.
- Unsupervised learning: Utilizes techniques such as clustering or anomaly detection, where patterns are identified without the use of labeled data.
- Reinforcement learning: Uses agents that make sequential decisions in an environment and maximize a cumulative reward through trial and error.
- Semi-supervised learning: Optimizes model performance using a mix of labeled and unlabeled examples across different tasks.
These approaches can help enhance predictive accuracy, uncover patterns and relationships, and adapt to evolving cyber threats.
Figure 7 highlights the prevalent application of supervised learning methodologies in multi-task learning across the cybersecurity field. However, there is a noticeable lack in the use of reinforcement learning, unsupervised learning, and semi-supervised learning methods within this field.
4.2.5 RQ5: What are the most frequently used machine learning and deep learning algorithms for MTL?
In our study, we identified several ML and DL algorithms that are frequently employed to address various security challenges. These algorithms play a crucial role in enhancing the functionality of cybersecurity systems. We categorized these algorithms into ML and DL groups. Figure 8 illustrates the number of research papers that apply ML and DL algorithms to each cybersecurity problem.
The ML algorithms include Sequential Minimal Optimization (SMO), Support Vector Machines (SVM), K-Means, Neural Networks (NN), Random Forests (RF), Simple Logistic Regression (SLR), Decision Trees (DT), k-Nearest Neighbors (k-NN), Fuzzy Logic, Naive Bayes (NB), AdaBoost, and Logistic Regression (LR). On the other hand, the DL algorithms include Bidirectional Long Short-Term Memory Networks (BiLSTM), Bidirectional Encoder Representations from Transformers (BERT), Deep Reinforcement Learning (DRL), Autoencoders (AE), Deep Neural Networks (DNN), Convolutional Neural Networks (CNN), Generative Adversarial Networks (GAN), Long Short-Term Memory (LSTM), Recurrent Neural Networks (RNN), Bidirectional Gated Recurrent Unit (BiGRU), Graph Neural Networks (GNN), Convolutional Graph Neural Networks (CGNN), Graph Autoencoders (GAE), Feedforward Neural Networks (NewFF), and Multilayer Perceptrons (MLP).
The main security challenges:
P1: Network intrusion problems.
P2: Malware problems.
P3: Cyber threat problems.
P4: Cyberbullying problems.
P5: Critical infrastructure problems.
As illustrated in Fig. 8a, Support Vector Machine (SVM) is the most frequently used ML algorithm for problems P1, P2, P4, and P5, whereas Random Forest is the predominant algorithm for problem P3. In Fig. 8b, Convolutional Neural Network (CNN) is the most commonly employed DL algorithm for problems P1 and P5. For problem P2, CNN, Deep Neural Network (DNN), and Autoencoder (AE) are the primary algorithms, each applied once. For problem P3, Multilayer Perceptron (MLP) is the prevalent DL algorithm, and for problem P4, BERT (Bidirectional Encoder Representations from Transformers) is used in the majority of studies. It is worth distinguishing between MLP and DNN: MLP refers to a specific architecture, a feedforward network of fully connected layers, while DNN is a broader term covering neural networks with many layers and various architectures.
4.2.6 RQ6: Which model provides the best performance for MTL?
Each primary study presents its own model. Table 7 outlines the various models utilized in multi-task learning within cybersecurity, detailing their specific advantages in building security systems. While most research papers introduce new models with dedicated names, some simply refer to their models as multi-task learning (MTL) models.
As shown in Table 7, the majority of multi-task models outperform their single-task counterparts, which indicates that they reduce complexity and handle multiple tasks efficiently.
4.2.7 RQ7: What datasets are used for evaluating MTL models?
In terms of cybersecurity-related challenges, we identified several public datasets and sources in the papers we reviewed. Table 8 presents an overview of these datasets. Following that, we provide a summary of their main characteristics.
For network intrusion detection, a wide range of datasets has been used: UNSW-NB15 [66, 71, 80], a hybrid dataset that merges benign network traffic with simulated cyberattacks of nine different types; CICIDS2017 [66, 69, 76, 80, 84], which contains 14 attack categories such as denial-of-service (DoS) attacks, distributed denial-of-service (DDoS) attacks, web attacks, and botnet intrusions; and Bot-IoT [71], which combines actual and simulated Internet of Things (IoT) network activity and includes intrusion attempts such as DDoS attacks and information theft. Additionally, the ISCX2012 [71] and ISCX VPN-nonVPN [79, 84] datasets present scenarios of network infiltration and encrypted traffic detection, respectively.
For malware detection, datasets such as IoT-23 and VARIoT [69] focus on IoT device traffic, while StratosphereIPS and MTA offer diverse mixes of malware and traffic. The Microsoft Malware Classification Challenge dataset [78] provides a diverse collection of malware samples organized into distinct groups and presented as disassembled and binary data files, which increases the complexity of the detection tasks.
In cyber threat detection, a variety of datasets have been developed: Study [31] constructed a dataset by aggregating microblogs from Weibo, China’s major microblog platform, amassing 11,508 relevant microblogs through API searches and random selection, while study [33] gathered 31,281 tweets. Study [32] collected 2,690 tweets from black market sites and 2,000 genuine tweets from users’ timelines, focusing on non-English tweets and those with adequate length.
In cyberbullying detection, studies [72, 77] used datasets such as ST-Bully, BullySent, and collections obtained from Hatebase.org, Twitter API, and diverse social media platforms. The datasets were selected to include a wide range of online interactions, covering everything from casual conversations to intense debates, in order to accurately capture samples of bullying, hate speech, aggression, and harassment.
Concerning critical infrastructure protection, datasets such as IEEE bus system [65], railway images [81], FaceForensics [83], and ASVspoof 2017 [85] were utilized. These datasets cover a variety of information, including power grid simulations, video manipulation datasets, and transactional communication data. They have played a major part in revealing previously unknown security threats and vulnerabilities, and they provide information that is essential for protecting critical infrastructure from potential risks and attacks.
4.2.8 RQ8: Which evaluation approaches and parameters are used in MTL?
The selected studies rely on metrics that are standard in cybersecurity analysis, including accuracy, precision, recall, F1 score, specificity, area under the curve (AUC), false positive and false negative rates, mean absolute and mean squared errors, detection rate, error rate, and task-specific loss functions. The choice of evaluation methods varied depending on the specific cybersecurity tasks and research aims, with precision, recall, and AUC emphasized where threats must be detected accurately while false positives are kept low.
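Most of the metrics listed above can be derived from the four counts of a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name and the example counts are hypothetical and are not drawn from any of the reviewed studies. Detection rate is treated here as a synonym for recall, which is the usual convention in intrusion detection but may differ in individual papers.

```python
def binary_metrics(tp, fp, tn, fn):
    """Derive common detection metrics from confusion-matrix counts."""
    def ratio(num, den):
        # Return 0.0 instead of failing when a denominator is empty.
        return num / den if den else 0.0

    total = tp + fp + tn + fn
    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)  # also called detection rate or TPR
    return {
        "accuracy": ratio(tp + tn, total),
        "error_rate": ratio(fp + fn, total),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "specificity": ratio(tn, tn + fp),
        "false_positive_rate": ratio(fp, fp + tn),
        "false_negative_rate": ratio(fn, fn + tp),
    }


# Hypothetical counts for an intrusion detector evaluated on 1,000 flows.
metrics = binary_metrics(tp=80, fp=10, tn=900, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With heavily imbalanced traffic such as in this example, accuracy is dominated by the benign class, which is one reason precision, recall, and the false positive rate are reported alongside it.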
Our findings highlight the diverse range of evaluation approaches and parameters utilized in assessing multi-task learning models in cybersecurity, demonstrating the adaptability and resilience necessary for facing complex and evolving cyber threats. In Fig. 10, we show the frequency of each evaluation parameter used across the identified cybersecurity problems. For network intrusion detection (P1), malware detection (P2), and critical infrastructure protection (P5), the accuracy metric was used most often, whereas cyberbullying detection (P4) leaned toward F1 score, recall, and precision. Conversely, cyber threat detection (P3) mainly used accuracy, F1 score, macro-F1, and weighted-F1 as primary evaluation metrics.
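The macro-F1 and weighted-F1 variants mentioned above differ only in how the per-class F1 scores are averaged: macro-F1 gives every class equal weight, while weighted-F1 weights each class by its number of true samples. A minimal sketch of both is given below; the class labels and predictions are hypothetical and serve only to show how a rare class affects the two averages.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: (f1, support)} using one-vs-rest counts."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)
    result = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        result[label] = (2 * tp / denom if denom else 0.0, support[label])
    return result


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(f1 for f1, _ in scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of the per-class F1 scores weighted by class support."""
    scores = per_class_f1(y_true, y_pred)
    total = sum(n for _, n in scores.values())
    return sum(f1 * n for f1, n in scores.values()) / total


# Hypothetical labels for ten posts; "offensive" is a rare class.
y_true = ["none"] * 6 + ["hate"] * 3 + ["offensive"]
y_pred = ["none"] * 5 + ["hate"] * 3 + ["none"] * 2

print("macro-F1:", macro_f1(y_true, y_pred))
print("weighted-F1:", weighted_f1(y_true, y_pred))
```

In this illustration the rare class is never predicted correctly, so macro-F1 falls well below weighted-F1, which is why the two are often reported together for imbalanced text classification tasks.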
Figure 9 illustrates the distribution of validation approaches and shows that cross-validation is the preferred method in the majority of papers. Nevertheless, some studies did not report the validation approach, which makes their experimental results harder to assess and reproduce. This highlights the importance of thorough reporting in research papers.
Taken together, Figs. 9 and 10 show that accuracy and F-measure are the most commonly used evaluation metrics and that cross-validation is the preferred validation approach.
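For readers unfamiliar with the procedure, k-fold cross-validation can be sketched as follows. This is a generic illustration under stated assumptions, not the protocol of any particular study: the fold count, the random seed, and the `evaluate` callback are hypothetical, and in a multi-task setting the same split would typically be shared by all tasks so that their scores are comparable.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Every k-th shuffled index goes to the same fold.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validate(n_samples, evaluate, k=5):
    """Average a user-supplied evaluate(train, test) score over the folds."""
    scores = [evaluate(train, test)
              for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


# Each sample appears in exactly one test fold across the k iterations.
for train, test in k_fold_indices(10, k=5):
    print(sorted(test), len(train))
```

The `evaluate` callback is where a model would be trained on the training indices and scored on the held-out indices, for example with the metrics discussed in this subsection.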
4.2.9 RQ9: Which implementation platforms are used to develop MTL models?
We investigated the various implementation platforms employed by research papers to develop and deploy cybersecurity solutions. The choice of implementation platform is a critical consideration as it directly impacts the efficiency and scalability of the solutions. Based on our findings, we can see the implementation platforms that have been used in Fig. 11.
A number of scholarly articles have emphasized the use of open-source ML and DL frameworks, such as TensorFlow (https://www.tensorflow.org/), PyTorch (https://pytorch.org/), and Keras (https://keras.io/), for the implementation of multi-task learning models. These frameworks offer an extensive array of tools, libraries, and pre-trained models, thereby expediting the development process of applications.
In the selected primary research papers, unfortunately, 48% of the studies did not explicitly mention the framework used, while 24% employed Keras, 14% utilized PyTorch, and another 14% employed TensorFlow. The choice of the implementation platform often depends on the specific objectives and requirements of the cybersecurity task at hand. Researchers tend to select platforms that align with factors such as data volume, real-time processing demands, and available resources. Furthermore, considerations pertaining to data privacy, regulatory compliance, and the necessity for robust security measures significantly affect the platform selection process. The widespread adoption of Keras in the selected MTL-based papers can be attributed to its high-level, user-friendly Application Programming Interface (API), modularity allowing compatibility with various DL frameworks like TensorFlow, and its provision of a user-friendly experimental environment, which makes it a popular choice compared to other platforms.
4.2.10 RQ10: What are the challenges and possible solutions for multi-task learning in cybersecurity?
To effectively address the challenges of MTL in the cybersecurity field, we conducted a systematic analysis and synthesis of the challenges identified in the selected papers. This synthesis process enables us to extract the main difficulties and their corresponding solutions, offering a systematic framework for understanding and dealing with the complexity of applying MTL in cybersecurity. To answer this research question, we provide Table 9, which summarizes the challenges along with their corresponding resolutions across the chosen papers.
4.3 Threats to validity
This systematic literature review aimed to comprehensively examine the existing research on the use of multi-task learning in the field of cybersecurity. The primary objective of this study was to respond to 10 research questions posed at the beginning of this research, determine the obstacles encountered, and evaluate the effectiveness of these ideas using established measures. The findings were aggregated in a manner that successfully addressed the research questions. The identification of difficulties has the potential to facilitate future research work and enhance the development of more effective solutions. Including the datasets used, as well as the ML classifier utilized and its corresponding accuracy, would provide a clear path for future research.
It is possible that some relevant articles were not included in the search results because they use terms that differ from our search terms. Certain observations may also have been affected by variations in terminology across publications. The involvement of all authors in the article selection process and the subsequent formation of a consensus helped mitigate any bias. To mitigate the risks to conclusion validity, the authors drew their findings from discussions held during several meetings. Consequently, the influence of individual interpretation of the data was minimized.
5 Conclusion and future work
Advances in multi-task learning algorithms have enabled the creation of efficient detection models. Despite the large number of models that have been built so far, there are still certain problems that lack clear answers. We conducted a systematic literature review to respond to 10 research questions, investigated 28 high-quality articles, and assessed the applications and implementations of multi-task learning algorithms. We discussed the difficulties and potential solutions in cybercrime detection algorithms based on multi-task learning. We identified the most used multi-task learning algorithms, frequently used datasets, ML categories, development platforms, assessment measures, validation strategies, and data sources. Moreover, research gaps and challenges were provided.
The results of our SLR have significant implications for various fields. In the military sector, it is of utmost importance to guarantee the protection of sensitive data in order to maintain national security. Reliable cybersecurity solutions play an essential role in the financial services industry to protect financial data. Data protection is essential in political systems to mitigate the risk of cyberattacks. This SLR emphasizes many emerging trends in cybersecurity that are applicable to these disciplines.
Our study has also identified several areas for future research and development. These include creating semi-supervised and unsupervised models for diverse detection systems, advancing hybrid multi-task learning architectures, establishing a robust framework for comparative analysis, and addressing specific challenges in identifying and mitigating diverse cyber threats. We emphasize the necessity of larger datasets that are accessible to the public for thorough evaluations and experiments. Furthermore, we recommend that future research should prioritize the creation of innovative detection models based on multi-task learning (MTL) to improve the identification and mitigation of a wide range of cyber threats in different areas of cybersecurity.
Additionally, further research may explore the potential effects of emerging technologies, such as Generative Artificial Intelligence (Gen AI) and Large Language Models (LLMs), on the use of machine learning and deep learning techniques in the field of cybersecurity. Incorporating recently published articles could further enhance the scope of this systematic literature review. An analysis of real-world cybersecurity systems, including case studies, success stories, and lessons learned, could also provide valuable insights into the potential impact of multi-task learning in the future.
Data availability
Data are available upon request.
References
Langan T (2022) Internet Crime Report www.ic3.gov
Shaukat K, Luo S, Varadharajan V, Hameed IA, Xu M (2020) A survey on machine learning techniques for cyber security in the last decade. IEEE Access 8:222310–222354. https://doi.org/10.1109/ACCESS.2020.3041951
Szor P (2005) The art of computer virus research and defense. Addison-Wesley Professional
Gümüşbaş D, Yıldırım T, Genovese A, Scotti F (2021) A comprehensive survey of databases and deep learning methods for cybersecurity and intrusion detection systems. IEEE Syst J 15(2):1717–1731. https://doi.org/10.1109/JSYST.2020.2992966
Ansari MF, Dash B, Sharma P, Yathiraju N (2022) The impact and limitations of artificial intelligence in cybersecurity: a literature review. IJARCCE 11:1–2. https://doi.org/10.17148/ijarcce.2022.11912
Taddeo M (2019) Three ethical challenges of applications of artificial intelligence in cybersecurity. Minds Mach 29:187–191. https://doi.org/10.1007/s11023-019-09504-8
Sagar R, Jhaveri R, Borrego C (2020) Applications in security and evasions in machine learning: a survey. Electronics. https://doi.org/10.3390/electronics9010097
Shaukat K, Luo S, Varadharajan V, Hameed IA, Chen S, Liu D, Li J (2020) Performance comparison and current challenges of using machine learning techniques in cybersecurity. Energies. https://doi.org/10.3390/en13102509
Abioye SO, Oyedele LO, Akanbi L, Ajayi A, Davila Delgado JM, Bilal M, Akinade OO, Ahmed A (2021) Artificial intelligence in the construction industry: a review of present status, opportunities and future challenges. J Build Eng 44:103299. https://doi.org/10.1016/j.jobe.2021.103299
Macas M, Wu C, Fuertes W (2022) A survey on deep learning for cybersecurity: progress, challenges, and opportunities. Comput Netw 212:109032. https://doi.org/10.1016/j.comnet.2022.109032
Crawshaw M (2020) Multi-task learning with deep neural networks: a survey. CoRR arXiv:2009.09796
Zhang Y, Yang Q (2021) A Survey on multi-task Learning. IEEE Trans Knowl Data Eng 34(12):5586–5609
Ruder S (2017) An overview of multi-task learning in deep neural networks. arXiv preprint arXiv:1706.05098
Zhang Y, Yang Q (2017) An overview of multi-task learning. Natl Sci Rev 5(1):30–43. https://doi.org/10.1093/nsr/nwx105
Yan A, Wang X, Yang Y, Fox R, Wang X, Gonzalez J (2020) Multi-task learning architectures and applications. Master’s thesis, EECS Department, University of California, Berkeley (May 2020). http://www2.eecs.berkeley.edu/Pubs/TechRpts/2020/EECS-2020-54.html
Ali R, Ali A, Iqbal F, Khattak AM, Aleem S (2020) A systematic review of artificial intelligence and machine learning techniques for cyber security. In: Tian Y, Ma T, Khan MK (eds) Big data and security. Springer, Singapore, pp 584–593
Aiyanyo ID, Samuel H, Lim H (2020) A systematic review of defensive and offensive cybersecurity with machine learning. Appl Sci. https://doi.org/10.3390/app10175811
Ahmad R, Alsmadi I (2021) Machine learning approaches to iot security: a systematic literature review. Internet Things 14:100365. https://doi.org/10.1016/j.iot.2021.100365
Senanayake J, Kalutarage H, Al-Kadri MO (2021) Android mobile malware detection using machine learning: a systematic review. Electronics. https://doi.org/10.3390/electronics10131606
Nassif AB, Talib MA, Nasir Q, Albadani H, Dakalbab FM (2021) Machine learning for cloud security: A systematic review. IEEE Access 9:20717–20735. https://doi.org/10.1109/ACCESS.2021.3054129
Catal C, Giray G, Tekinerdogan B, Kumar S, Shukla S (2022) Applications of deep learning for phishing detection: a systematic literature review. Knowl Inf Syst 64:1457–1500. https://doi.org/10.1007/s10115-022-01672-x
Kitchenham B, Pearl Brereton O, Budgen D, Turner M, Bailey J, Linkman S (2009) Systematic literature reviews in software engineering - a systematic literature review. Inf Softw Technol 51(1):7–15. https://doi.org/10.1016/j.infsof.2008.09.009
The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ 372 (2021). https://doi.org/10.1136/bmj.n71 https://www.bmj.com/content/372/bmj.n71.full.pdf
Moore R (2014) Cybercrime: investigating high-technology computer crime. Taylor and Francis, Canada, pp 1–318
Cyber security breaches survey 2024 (2024). https://www.gov.uk/government/statistics/cyber-security-breaches
Khraisat A, Gondal I, Vamplew P, Kamruzzaman J (2019) Survey of intrusion detection systems: techniques, datasets and challenges. Cybersecurity. https://doi.org/10.1186/s42400-019-0038-7
Mvula PK, Branco P, Jourdan G-V, Viktor HL (2023) A systematic literature review of cyber-security data repositories and performance assessment metrics for semi-supervised learning. Discov Data. https://doi.org/10.1007/s44248-023-00003-x
Alkhalil Z, Hewage C, Nawaf L, Khan I (2021) Phishing attacks: a recent comprehensive study and a new anatomy. Front Media SA. https://doi.org/10.3389/fcomp.2021.563060
Jáñez-Martino F, Alaiz-Rodríguez R, González-Castro V, Fidalgo E, Alegre E (2023) A review of spam email detection: analysis of spammer strategies and the dataset shift problem. Artif Intell Rev 56(2):1145–1173
Smith PK, Mahdavi J, Carvalho M, Fisher S, Russell S, Tippett N (2008) Cyberbullying: its nature and impact in secondary school pupils. J Child Psychol Psychiatry 49(4):376–385. https://doi.org/10.1111/j.1469-7610.2007.01846.x
Qu Z, Lyu C, Chi CH (2022) Multi-task learning framework for detecting hashtag hijack attack in mobile social networks. Institute of Electrical and Electronics Engineers Inc., New Jersey, pp 90–98
Arora U, Paka WS, Chakraborty T (2019) Multitask learning for blackmarket tweet detection. Association for Computing Machinery Inc, New York, pp 127–130
Dionísio N, Alves F, Ferreira P, Bessani A (2020) Towards end-to-end cyberthreat detection from twitter using multi-task learning. IEEE Computational Intelligence (IEEE WCCI) 2020
Viganò E, Loi M, Yaghmaei E (2020) Cybersecurity of critical infrastructure. The Ethics of Cybersecurity, pp 157–177
Shafin SS, Karmakar G, Mareels I (2023) Obfuscated memory malware detection in resource-constrained iot devices for smart city applications. Sensors. https://doi.org/10.3390/s23115348
Shafin SS, Prottoy SA, Abbas S, Hakim SB, Chowdhury A, Rashid MM (2021) Distributed denial of service attack detection using machine learning and class oversampling. In: Mahmud M, Kaiser MS, Kasabov N, Iftekharuddin K, Zhong N (eds) Appl Intell Inform. Springer, Cham, pp 247–259
Alghazzawi D, Bamasag O, Ullah H, Asghar MZ (2021) Efficient detection of ddos attacks using a hybrid deep learning model with improved feature selection. Appl Sci. https://doi.org/10.3390/app112411634
Thung KH, Wee CY (2018) A brief review on multi-task learning. Multimedia Tools Appl 77(22):29705–29725
Firdausi I, Lim C, Erwin A, Nugroho AS (2010) Analysis of machine learning techniques used in behavior-based malware detection. In: 2010 second international conference on advances in computing, control, and telecommunication technologies, pp 201–203. https://doi.org/10.1109/ACT.2010.33
Caruana R (1997) Multitask learning. Mach Learn 28:41–75
Evgeniou T, Pontil M (2004) Regularized multi-task learning. KDD ’04. Association for Computing Machinery, New York, NY, USA, pp 109–117
Paredes BR, Argyriou A, Berthouze N, Pontil M (2012) Exploiting unrelated tasks in multi-task learning. In: Lawrence ND, Girolami M (eds.) Proceedings of the fifteenth international conference on artificial intelligence and statistics. Proceedings of machine learning research. PMLR, La Palma, Canary Islands vol. 22, pp 951–959. https://proceedings.mlr.press/v22/romera12.html
Baxter J (2000) A model of inductive bias learning. J Artif Intell Res 12:149–198. https://doi.org/10.1613/jair.731
Ben-David S, Schuller R (2003) Exploiting task relatedness for multiple task learning. In: Schölkopf B, Warmuth MK (eds) Learning theory and kernel machines. Springer, Berlin, Heidelberg, pp 567–580
Thrun S (1995) Is learning the n-th thing any easier than learning the first?. In: Proceedings of the 8th international conference on neural information processing systems. NIPS’95. MIT Press, Cambridge, MA, USA pp 640–646
Park C, Kim Y, Park Y, Kim SB (2018) Multitask learning for virtual metrology in semiconductor manufacturing systems. Comput Ind Eng 123:209–219. https://doi.org/10.1016/j.cie.2018.06.024
Ganin Y, Lempitsky V (2015) Unsupervised domain adaptation by backpropagation. In: Bach F, Blei D (eds.) Proceedings of the 32nd international conference on machine learning. Proceedings of machine learning research. PMLR, Lille, France vol. 37, pp 1180–1189. https://proceedings.mlr.press/v37/ganin15.html
Widmer C, Toussaint NC, Altun Y, Rätsch G (2010) Inferring latent task structure for multitask learning by multiple kernel learning. BMC Bioinform 11:1–8. https://doi.org/10.1186/1471-2105-11-S8-S5
Ramsundar B, Kearnes S, Riley P, Webster D, Konerding D, Pande V (2015) Massively multitask networks for drug discovery. arXiv preprint arXiv:1502.02072
Widmer C, Leiva J, Altun Y, Rätsch G (2010) Leveraging sequence classification by taxonomy-based multitask learning. In: Berger B (ed) Research in computational molecular biology. Springer, Berlin, Heidelberg, pp 522–534
Deng L, Hinton G, Kingsbury B (2013) New types of deep neural network learning for speech recognition and related applications: an overview. In: 2013 IEEE international conference on acoustics, speech and signal processing, pp 8599–8603. https://doi.org/10.1109/ICASSP.2013.6639344
Collobert R, Weston J (2008) A unified architecture for natural language processing: deep neural networks with multitask learning. In: Proceedings of the 25th international conference on machine learning. ICML ’08. Association for Computing Machinery, New York, NY, USA pp 160–167. https://doi.org/10.1145/1390156.1390177
Chapelle O, Shivaswamy P, Vadrevu S, Weinberger K, Zhang Y, Tseng B (2010) Multi-task learning for boosting with application to web search ranking. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining. KDD ’10. Association for Computing Machinery, New York, NY, USA, pp 1189–1198. https://doi.org/10.1145/1835804.1835953
Wang S, Wang Q, Gong M (2020) Multi-task learning based network embedding. Front Neurosci. https://doi.org/10.3389/fnins.2019.01387
Hu Z, Zhao Z, Yi X, Yao T, Hong L, Sun Y, Chi EH (2022) Improving multi-task generalization via regularizing spurious correlation. Adv Neural Inform Proc Syst 35:11450–11466
Argyriou A, Evgeniou T, Pontil M (2006) Multi-task feature learning. In: Schölkopf B, Platt J, Hoffman T (eds.) Advances in neural information processing systems, vol. 19. MIT Press, Cambridge. https://doi.org/10.1145/1835804.1835953
Thung KH, Wee CY (2018) A brief review on multi-task learning. Multimed Tools Appl 77:29705–29725. https://doi.org/10.1007/s11042-018-6463-x
Weiss K, Khoshgoftaar TM, Wang D (2016) A survey of transfer learning. J Big Data. https://doi.org/10.1186/s40537-016-0043-6
Ke GY, Pan Y, Yin J, Huang CQ (2017) Optimizing evaluation metrics for multitask learning via the alternating direction method of multipliers. IEEE Trans Cybern 48(3):993–1006
Idriss I, Azizi M, Moussaoui O (2020) IoT security with deep learning-based intrusion detection systems: a systematic literature review. In: 2020 Fourth international conference on intelligent computing in data sciences (ICDS), IEEE, pp 1–10
Catal C, Giray G, Tekinerdogan B (2022) Applications of deep learning for mobile malware detection: a systematic literature review. Neural Comput Appl 34(2):1007–1032
Wang Z, Liu Q, Chi Y (2020) Review of Android malware detection based on deep learning. IEEE Access 8:181102–181126. https://doi.org/10.1109/ACCESS.2020.3028370
Andročec D, Vrček N (2019) Machine learning for the internet of things security: a systematic review. SciTePress, Portugal, pp 563–570
Kitchenham B, Pretorius R, Budgen D, Pearl Brereton O, Turner M, Niazi M, Linkman S (2010) Systematic literature reviews in software engineering - a tertiary study. Inf Softw Technol 52(8):792–805. https://doi.org/10.1016/j.infsof.2010.03.006
Takiddin A, Atat R, Ismail M, Davis K, Serpedin E (2023) A graph neural network multi-task learning-based approach for detection and localization of cyberattacks in smart grids. Institute of Electrical and Electronics Engineers (IEEE), New Jersey, pp 1–5
Albelwi SA (2022) An intrusion detection system for identifying simultaneous attacks using multi-task learning and deep learning. In: 2022 2nd International conference on computing and information technology (ICCIT), IEEE, pp 349–353
Hamdan S, Almajali S, Ayyash M, Salameh HB, Jararweh Y (2023) An intelligent edge-enabled distributed multi-task learning architecture for large-scale IoT-based cyber-physical systems. Simul Modell Pract Theory. https://doi.org/10.1016/j.simpat.2022.102685
Liu Q, Wang D, Jia Y, Luo S, Wang C (2022) A multi-task based deep learning approach for intrusion detection. Knowl Based Syst. https://doi.org/10.1016/j.knosys.2021.107852
Ali S, Abusabha O, Ali F, Imran M, Abuhmed T (2023) Effective multitask deep learning for IoT malware detection and identification using behavioral traffic analysis. IEEE Trans Netw Serv Manag 20:1199–1209. https://doi.org/10.1109/TNSM.2022.3200741
Bader O, Lichy A, Hajaj C, Dubin R, Dvir A (2022) MalDIST: from encrypted traffic classification to malware traffic detection and classification. In: 2022 IEEE 19th annual consumer communications & networking conference (CCNC), IEEE, pp 527–533
Lan J, Liu X, Li B, Sun J, Li B, Zhao J (2022) MEMBER: a multi-task learning model with hybrid deep features for network intrusion detection. Comput Secur. https://doi.org/10.1016/j.cose.2022.102919
Maity K, Sen T, Saha S, Bhattacharyya P (2022) MTBullyGNN: a graph neural network-based multitask framework for cyberbullying detection. IEEE Trans Comput Soc Syst. https://doi.org/10.1109/tcss.2022.3230974
Bensaoud A, Kalita J (2022) Deep multi-task learning for malware image classification. J Inf Secur Appl 64:103057
Aceto G, Ciuonzo D, Montieri A, Pescapé A (2021) Distiller: encrypted traffic classification via multimodal multitask deep learning. J Netw Comput Appl. https://doi.org/10.1016/j.jnca.2021.102985
Lee SH, Lan SC, Huang HC, Hsu CW, Chen YS, Shieh S (2021) EC-Model: an evolvable malware classification model. In: 2021 IEEE conference on dependable and secure computing (DSC), IEEE, pp 1–8
Barut O, Luo Y, Zhang T, Li W, Li P (2021) Multi-task hierarchical learning based network traffic analytics. In: ICC 2021-IEEE International conference on communications, IEEE, pp 1–6
Kapil P, Ekbal A (2020) A deep neural network based multi-task learning approach to hate speech detection. Knowl Based Syst. https://doi.org/10.1016/j.knosys.2020.106458
Wang S, Wang Q, Jiang Z, Wang X, Jing R (2020) A weak coupling of semi-supervised learning with generative adversarial networks for malware classification. Institute of Electrical and Electronics Engineers Inc., New Jersey, pp 3775–3782
Rezaei S, Liu X (2020) Multitask learning for network traffic classification. In: 2020 29th International Conference on computer communications and networks (ICCCN), IEEE, pp 1–9
Sun L, Zhou Y, Wang Y, Zhu C, Zhang W (2020) The effective methods for intrusion detection with limited network attack data: multi-task learning and oversampling. IEEE Access 8:185384–185398. https://doi.org/10.1109/ACCESS.2020.3029100
Li X, Zhu L, Yu Z, Guo B, Wan Y (2020) Vanishing point detection and rail segmentation based on deep multi-task learning. IEEE Access 8:163015–163025. https://doi.org/10.1109/ACCESS.2020.3019318
Hu T, Guo Q, Shen X, Sun H, Wu R, Xi H (2019) Utilizing unlabeled data to detect electricity fraud in AMI: a semisupervised deep learning approach. IEEE Trans Neural Netw Learn Syst 30:3287–3299. https://doi.org/10.1109/TNNLS.2018.2890663
Nguyen HH, Fang F, Yamagishi J, Echizen I (2019) Multi-task learning for detecting and segmenting manipulated facial images and videos. In: 2019 IEEE 10th international conference on biometrics theory, applications and systems (BTAS), IEEE, pp 1–8
Zhao Y, Chen J, Wu D, Teng J, Yu S (2019) Multi-task network anomaly detection using federated learning. Association for Computing Machinery, New York, pp 273–279
Li J, Sun M, Zhang X (2019) Multi-task learning of deep neural networks for joint automatic speaker verification and spoofing detection. In: 2019 Asia-Pacific signal and information processing association annual summit and conference (APSIPA ASC), IEEE, pp 1517–1522
Huang H, Deng H, Chen J, Han L, Wang W (2018) Automatic multi-task learning system for abnormal network traffic detection. Int J Emerg Technol Learn 13:4–20. https://doi.org/10.3991/ijet.v13i04.8466
Demertzis K, Iliadis L, Anezakis VD (2018) MOLESTRA: a multi-task learning approach for real-time big data analytics. In: 2018 Innovations in intelligent systems and applications (INISTA), IEEE, pp 1–8
Li B, Lin Y, Zhang S (2017) Multi-task learning for intrusion detection on web logs. J Syst Archit 81:92–100. https://doi.org/10.1016/j.sysarc.2017.10.011
Yu J, Zhang B, Kuang Z, Lin D, Fan J (2017) iPrivacy: image privacy protection by identifying sensitive objects via deep multi-task learning. IEEE Trans Inf Forensics Secur 12:1005–1016. https://doi.org/10.1109/TIFS.2016.2636090
Moustafa N, Slay J (2015) UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In: 2015 military communications and information systems conference (MilCIS), pp 1–6. https://doi.org/10.1109/MilCIS.2015.7348942
IDS 2017 | Datasets | Research | Canadian Institute for Cybersecurity | UNB. https://www.unb.ca/cic/datasets/ids-2017.html
IDS 2018 | Datasets | Research | Canadian Institute for Cybersecurity | UNB. https://www.unb.ca/cic/datasets/ids-2018.html
Garcia S, Gomaa A, Babayeva K. Slips, Behavioral machine learning-based Python IPS. Available at: https://github.com/stratosphereips/StratosphereLinuxIPS/tree/master. Accessed 13 Mar 2024
VPN 2016 | Datasets | Research | Canadian Institute for Cybersecurity | UNB. https://www.unb.ca/cic/datasets/vpn.html
Tor 2016 | Datasets | Research | Canadian Institute for Cybersecurity | UNB. https://www.unb.ca/cic/datasets/tor.html
IDS 2012 | Datasets | Research | Canadian Institute for Cybersecurity | UNB. https://www.unb.ca/cic/datasets/ids.html
The Bot-IoT Dataset | UNSW Research. https://research.unsw.edu.au/projects/bot-iot-dataset
NSL-KDD | Datasets | Research | Canadian Institute for Cybersecurity | UNB. https://www.unb.ca/cic/datasets/nsl.html
The CTU-13 Dataset. A Labeled Dataset with Botnet, Normal and Background traffic. - Stratosphere IPS. https://www.stratosphereips.org/datasets-ctu13
CESNET-QUIC22: a large one-month QUIC network traffic dataset from backbone lines. https://zenodo.org/records/7409924
IoT-23 Dataset: a labeled dataset of Malware and Benign IoT Traffic. - Stratosphere IPS. https://www.stratosphereips.org/datasets-iot23
Janiszewski M, Felkner A, Lewandowski P, Rytel M, Romanowski H (2021) Automatic actionable information processing and trust management towards safer internet of things. Sensors. https://doi.org/10.3390/s21134359
Malware-traffic-analysis.net. https://malware-traffic-analysis.net/
Ronen R (2018) Microsoft malware classification challenge. arXiv preprint arXiv:1802.10135
Davidson T, Warmsley D, Macy M, Weber I (2017) Automated hate speech detection and the problem of offensive language. In: Proceedings of the international AAAI conference on web and social media, vol. 11, pp 512–515
Waseem Z, Hovy D (2016) Hateful symbols or hateful people? Predictive features for hate speech detection on Twitter. In: Proceedings of the NAACL student research workshop, pp 88–93
Kumar R, Reganti AN, Bhatia A, Maheshwari T (2018) Aggression-annotated corpus of Hindi-English code-mixed data. arXiv preprint arXiv:1803.09402
Zampieri M, Malmasi S, Nakov P, Rosenthal S, Farra N, Kumar R (2019) Predicting the type and target of offensive posts in social media. arXiv preprint arXiv:1902.09666
Golbeck J, Ashktorab Z, Banjo RO, Berlinger A, Bhagwan S, Buntain C, Cheakalos P, Geller AA, Gnanasekaran RK, Gunasekaran RR et al (2017) A large labeled corpus for online harassment research. In: Proceedings of the 2017 ACM on web science conference, pp 229–233
Maity K, Saha S (2021) BERT-capsule model for cyberbullying detection in code-mixed Indian languages, pp 147–155. https://doi.org/10.1007/978-3-030-80599-9_13
Rössler A, Cozzolino D, Verdoliva L, Riess C, Thies J, Nießner M (2018) FaceForensics: a large-scale video dataset for forgery detection in human faces. arXiv preprint arXiv:1803.09179
Rössler A, Cozzolino D, Verdoliva L, Riess C, Thies J, Nießner M (2019) FaceForensics++: learning to detect manipulated facial images. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 1–11
ASVspoof. https://www.asvspoof.org/index2017.html
Acknowledgements
This publication was supported by Qatar University Internal Grant No. QUUG-CENG-CSE-2022. The findings achieved herein are solely the responsibility of the authors.
Funding
Open Access funding provided by the Qatar National Library.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ibrahim, S., Catal, C. & Kacem, T. The use of multi-task learning in cybersecurity applications: a systematic literature review. Neural Comput & Applic 36, 22053–22079 (2024). https://doi.org/10.1007/s00521-024-10436-3
DOI: https://doi.org/10.1007/s00521-024-10436-3