1 Introduction

In today’s digital age, the significance of cybersecurity cannot be overstated. It serves as a critical defense for digital systems, networks, and data, protecting them against unauthorized access, theft, or corruption. The rapid growth of digital technology has made cybersecurity more crucial than ever, as cyber threats can disrupt organizations in ways ranging from personal data breaches to interference with financial transactions, leading to significant financial losses and reputational damage. Over the last 5 years, the FBI’s Internet Crime Complaint Center (IC3) has registered an annual average of 652,000 complaints. Since 2018, the total number of complaints has reached 3.26 million, with losses amounting to $27.6 billion [1]. Prioritizing cybersecurity is essential for both individuals and organizations to mitigate these threats.

Cybersecurity covers a wide variety of procedures, methods, and technologies that work together to protect the availability, confidentiality, and integrity of networks, software, and data. It involves the development of robust security protocols, sophisticated encryption models, and proactive countermeasures. Cyber defense mechanisms operate across hosts, networks, applications, and data. Multiple solutions are available, working side by side to prevent threats and identify security breaches. These include firewalls, anti-virus solutions, intrusion detection systems (IDSs), and intrusion prevention systems (IPSs) [2]. A race has persisted between cybercriminals and defenders ever since the discovery of the first computer virus in 1970 [3]. The battle against cybersecurity threats, and the challenge of keeping pace with the speed at which they evolve, has become increasingly demanding over time.

Recently, cybersecurity experts have become more interested in artificial intelligence (AI) because it can effectively analyze and organize considerable amounts of internet traffic data [4]. According to estimates, the global market size for AI in the cybersecurity sector reached USD 14.9 billion in 2021, with a projected market value of USD 133.8 billion by 2030 [5]. AI and machine learning (ML) techniques are being increasingly integrated into the domain of cybersecurity [6,7,8]. ML is a subset of AI that employs computer programs to learn from historical data for modeling, control, or prediction. It includes reinforcement learning, supervised learning, unsupervised learning, and semi-supervised learning [9]. Deep learning (DL) relies on networks built from multiple layers (e.g., convolutional layers and batch normalization layers) and has emerged as a critical component in addressing complex cybersecurity challenges [10]. However, DL is well known for its extensive data requirements and computational demands, which can be mitigated through multi-task learning (MTL).

MTL refers to an ML training approach in which models are trained concurrently using data from multiple tasks. This is achieved by leveraging shared layers, enabling models to recognize the main correlations across a set of interconnected tasks [11]. MTL initially aimed to address the data sparsity problem by aggregating labeled data from all tasks, reducing manual labeling costs, and reusing existing knowledge. As Big Data emerges in different areas of AI such as computer vision and natural language processing (NLP), deep MTL models can provide higher performance than single-task models. MTL utilizes more data from different tasks, learning more robust representations and yielding stronger models in terms of overfitting risk and performance [12].

MTL offers several advantages over single-task learning (STL), such as leveraging similarities and relationships between tasks, acting as a regularizer, improving generalization, and reducing the risk of overfitting [11, 13]. By jointly learning multiple tasks, the model can take advantage of additional information available in the training data, leading to more robust and accurate predictions. MTL also addresses the challenge of limited data, as it allows the model to learn from related tasks, thereby facilitating the transfer of knowledge and enhancing the learning process for individual malware detection tasks [14].

In ML/DL, optimizing a single task can lead to reasonable performance, but it can be costly and may fail to cover edge cases, particularly because training complicated tasks requires significant computational resources. Multi-task learning can help address these issues by providing more diverse data and reducing training time and resources. Using multiple tasks provides more data overall and increases its diversity, thus enhancing the overall performance of the system [15].

Several systematic review studies have explored the application of existing classification algorithms to detect cyber threats. For instance, the study [16] conducted a systematic review of AI and ML techniques for cybersecurity, while [17] presented a systematic review of defensive and offensive cybersecurity with ML. A number of articles have also looked into how ML and DL can be used in specific areas of cybersecurity, such as (i) detecting malware on the Internet of Things [18], (ii) detecting malware on Android mobile devices [19], (iii) cloud security [20], and (iv) detecting phishing [21].

However, these systematic reviews only looked at certain areas of cybersecurity that used a single-task learning approach and did not consider the potential advantages of MTL. To address this gap, our study provides an overview of how MTL techniques are used in the cybersecurity domain. The primary objective of this paper is to conduct a systematic literature review (SLR) on MTL in cybersecurity through ten research questions. This review covers a variety of multi-task models for tasks related to cybersecurity, including network-based and host-based intrusion detection, cyber threat detection, cyberbullying detection, malware detection, and critical infrastructure attack detection.

To the best of our knowledge, this is the first SLR that offers a comprehensive overview of MTL’s application across various cybersecurity application domains.

In this study, we investigate the application of multi-task learning techniques in cybersecurity, following a systematic literature review (SLR) to identify and synthesize relevant research methodologies. This SLR provides a comprehensive overview of the state of the art, supporting our investigation into effective multi-task learning strategies for improving prediction models in cybersecurity. The insights gathered from this SLR study are significant for recognizing new trends and establishing more effective cybersecurity strategies in these critical domains.

The following ten research questions were formulated in this research:

RQ1: What are the potential applications (e.g., detection of malware, detection of network intrusion) of MTL in the cybersecurity domain?

RQ2: What advantages does MTL offer for cybersecurity?

RQ3: What type of tasks are used in MTL-based models?

RQ4: What types of machine learning techniques (e.g., unsupervised) are used for MTL models in cybersecurity?

RQ5: What are the most frequently used machine learning and deep learning algorithms for MTL in cybersecurity?

RQ6: Which model provides the best performance for MTL in cybersecurity?

RQ7: What datasets are used for evaluating multi-task learning models in cybersecurity?

RQ8: What kind of evaluation approaches and parameters are used?

RQ9: Which implementation platforms are used in MTL studies?

RQ10: What are the challenges and possible solutions for multi-task learning in cybersecurity?

Data were extracted from 28 papers and analyzed using both quantitative synthesis and qualitative analysis. We recognize that the methods applied for data extraction and analysis can vary across studies; consequently, we have adopted a detailed approach to address this diversity. Our synthesis uses quantitative techniques to systematically assess and summarize numerical data, and qualitative analysis to explore the aspects and outcomes reported in the literature. Furthermore, our review complies with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) reporting guidelines, ensuring quality in our methodology [22, 23]. For a detailed presentation of our data extraction and analysis procedures, readers are encouraged to refer to the Research methodology section (Sect. 3), where we describe our approach in depth.

The main contributions of this study are listed as follows:

  1. This article presents the first SLR in the literature on the implementation of multi-task learning in cybersecurity and offers insights into existing research, methodologies, and challenges.

  2. This research identifies and categorizes five critical applications where MTL is applied, shedding light on specific areas such as network intrusion detection and malware detection. This categorization helps in understanding the diverse applications of MTL in cybersecurity.

  3. This SLR not only synthesizes existing knowledge but also evaluates research trends, highlighting the predominant use of supervised learning algorithms and identifying research challenges and potential research areas for future exploration.

Building on the fundamental ideas presented in the background in Sect. 2, this article examines the important facets of multi-task learning (MTL) in cybersecurity in the following sections. Section 2 presents background and related work. In particular, Subsect. 2.1 discusses several cybersecurity issues that are the main focus of our investigation. Subsection 2.2 explores multi-task learning, explaining its foundational ideas and cybersecurity applications. Section 3 describes the research methodology, emphasizing the review protocol. Section 4 presents the detailed outcomes obtained from this SLR study, where Subsect. 4.1 elaborates on the primary papers selected, while Subsect. 4.2 provides the detailed answers to the predefined research questions. Subsection 4.3 presents the threats to validity. Section 5 presents the conclusion and future work.

2 Background and related work

This section covers the foundational aspects of cybersecurity and multi-task learning and examines related work in the field. It begins with a brief overview of cybersecurity fundamentals, followed by an exploration of multi-task learning concepts. It then reviews relevant studies that employ multi-task learning techniques, providing insights into their methodologies and findings.

2.1 Cybersecurity problems

To briefly address cybersecurity challenges, we enumerate the most significant types of cyberattacks. These include phishing attacks, impersonation attacks, malware attacks, denial-of-service (DoS) attacks, and hacking and unauthorized access. Additionally, there are specific threats to social media platforms, such as hashtag hijack attacks and black-market tweet services. Each of these poses unique threats to digital security and integrity, encompassing various forms of unauthorized access, malicious software, misleading strategies, and social manipulation.

A cyberattack refers to an intentional action aimed at compromising the confidentiality, integrity, or availability of IT infrastructures, including their hardware, software, or electronic data. These attacks involve criminal operations leveraging digital technology, including computers, cellphones, the internet, and other digital devices. It is important to note that such attacks not only compromise the CIA triad but also affect other key security properties such as authenticity and non-repudiation [24].

The Cyber Security Breaches Survey, aligned with the UK’s National Cyber Strategy, informs government policy to enhance cyber resilience among businesses, charities, and educational institutions [25]. It examines their cyber security policies, processes, and responses to various cyberattacks and crimes. The survey reports the most common types of cyberattacks in 2024, which include phishing attacks, impersonation attacks, malware attacks, denial-of-service (DoS) attacks, and hacking and unauthorized access. These attacks vary in their methods and impacts, affecting a significant portion of businesses and charities. Table 1 summarizes these common cyberattacks, detailing their distinct characteristics and impacts.

Table 1 The most common types of cyberattacks in 2024 UK

Cybercrimes cover a wide spectrum of activities, each introducing unique threats to cybersecurity. Network intrusion, for instance, involves unauthorized access to digital networks, which can lead to the loss of valuable resources and put data security at risk. These intrusions usually go through a series of steps, beginning with gathering information and ending with the compromise of data [26].

Malware, a common type of cyber threat, is software that is specifically created to carry out harmful actions on targeted computers, resulting in disruptions. Types of malware include viruses, worms, ransomware, spyware, adware, and scareware [27].

Phishing is a deceptive strategy used by attackers to trick users into giving away sensitive information, such as personal details, banking credentials, IDs, and passwords. Attackers do this by impersonating trustworthy websites of reputable organizations. Phishing attacks come in various forms, such as deceptive phishing and technical subterfuge [28].

Spam, known for its unsolicited and unwanted messages sent in bulk, serves a variety of purposes, from advertising products or services to promoting fraudulent schemes. Spam can be spread across several channels including social media platforms, causing inconvenience and potential risks to users [29].

Cyberbullying, in turn, entails the dissemination of offensive and discriminatory language on various social media platforms. Cyberbullying has the potential to extend beyond personal harm and cause wider social disruptions, and in some cases, it can even play a role in political violence [30].

Social media platforms also encounter unique cyber threats, such as hashtag hijack attacks on mobile social networks. These attacks can disrupt users’ search for relevant content and potentially result in the spread of irrelevant spam or unrelated topics [31]. In addition, work on black-market tweet detection draws attention to the problem of fabricated content evaluations, such as likes, retweets, and quotes, produced through unnatural engagement. This challenges the authenticity of online interactions [32, 33].

Furthermore, critical infrastructure systems, including hospitals, telecommunications, energy, banking, finance, and postal sectors, are prime targets for cyberattacks. The definition of a cyberattack on infrastructure can be ambiguous, resulting in the categorization of four distinct types. These types are based on the means of attack (physical or cyber-physical) and the resulting damage (physical or functional) [34].

Researchers increasingly rely on AI techniques, particularly ML and DL methods, to deal with the rising threat of cybercrimes. Recent progress in cybersecurity research has offered valuable insights into emerging threats and effective mitigation strategies. In malware detection, work on obfuscated memory malware (OMM) [35] provides a concise and efficient method for identifying new malware in resource-constrained embedded and IoT devices, using hybrid models to outperform existing detection methods. In the same vein, studies of denial-of-service (DoS) and distributed denial-of-service (DDoS) attacks [36, 37] offer valuable knowledge on mitigating the effects of disruptive attacks on digital systems.

MTL is a powerful method that has been applied to many different types of cybercrime, such as network intrusion, malware, cyber threats, cyberbullying, and vulnerabilities in critical infrastructure. Section 4 (Results) investigates the utilization of MTL in these domains in depth.

2.2 Multi-task learning

MTL aims to enhance the performance of learning tasks by using information shared between tasks [38]. MTL can learn multiple output targets based on a single input source, a single output target based on multiple input sources, or a combination of these two approaches [39]. To illustrate the concept of MTL in a practical context, consider the scenario of a security analyst tasked with identifying spam and phishing emails. These are related tasks, but with distinct characteristics:

Spam: Unsolicited bulk advertising.

Phishing: Attempts to trick recipients into revealing personal information or clicking malicious links.

In traditional single-task learning, separate models might be trained for each task. However, with multi-task learning (MTL), a single model can be trained to handle both tasks simultaneously. For instance, the model could analyze email content for common indicators of spam (e.g., keywords, sender information) and phishing (e.g., urgency, suspicious attachments), leveraging shared knowledge between the tasks.

By employing MTL, the model can improve its accuracy in identifying both spam and phishing emails while also benefiting from increased efficiency through shared learning. This example illustrates how MTL can be a valuable approach in cybersecurity for addressing related threats more effectively.
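The spam and phishing scenario above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a method from any of the reviewed studies: the four binary features, the layer sizes, the random untrained weights, and all function names are hypothetical. It shows only the structure of the idea, namely one shared representation feeding two task-specific heads whose losses are summed into a single joint objective.

```python
import math
import random

def shared_features(x, W_shared):
    """Shared layer used by both tasks: h = tanh(W_shared @ x)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W_shared]

def task_head(h, w, b):
    """Task-specific logistic head: sigmoid(w . h + b)."""
    z = sum(wi * hi for wi, hi in zip(w, h)) + b
    return 1.0 / (1.0 + math.exp(-z))

def bce(p, y, eps=1e-12):
    """Binary cross-entropy for a single example."""
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))

def joint_loss(x, y_spam, y_phish, params):
    """Sum of both task losses, computed from one shared representation."""
    h = shared_features(x, params["W_shared"])
    p_spam = task_head(h, params["w_spam"], params["b_spam"])
    p_phish = task_head(h, params["w_phish"], params["b_phish"])
    return bce(p_spam, y_spam) + bce(p_phish, y_phish), p_spam, p_phish

# Hypothetical setup: random (untrained) weights, 4 input features, 3 shared units.
rng = random.Random(0)
n_features, n_hidden = 4, 3
params = {
    "W_shared": [[rng.uniform(-1, 1) for _ in range(n_features)]
                 for _ in range(n_hidden)],
    "w_spam": [rng.uniform(-1, 1) for _ in range(n_hidden)],
    "b_spam": 0.0,
    "w_phish": [rng.uniform(-1, 1) for _ in range(n_hidden)],
    "b_phish": 0.0,
}

# Assumed features: promo keyword, unknown sender, urgent wording, suspicious link.
x = [0.0, 1.0, 1.0, 1.0]
loss, p_spam, p_phish = joint_loss(x, y_spam=0, y_phish=1, params=params)
```

In a trained model, gradients of the joint loss would update `W_shared` using signals from both tasks, which is where the shared knowledge described above would come from.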

MTL is an approach to ML that leverages information from relevant learning tasks to solve multiple tasks concurrently [40, 41]. By incorporating domain information into the training signals for related tasks, this approach improves generalization by providing an inductive bias [41]. The idea behind this approach is that the knowledge gained from each task can enhance the learning process for other tasks [40]. MTL can be useful when tasks have similarities, but it has also been proven to be advantageous for learning tasks that are not related to each other [42].

MTL is distinct from STL, in which each task is handled independently and model parameters are learned separately, as shown in Fig. 1. MTL depends on the interrelation of tasks to effectively learn them all together. Training signals from related tasks can significantly enhance the learning of model parameters for each task [38, 43]. MTL has been shown to improve model performance, especially in scenarios with limited training examples and related tasks [44, 45].

Fig. 1 Comparison of single- and multi-task learning frameworks [46]

Various MTL scenarios have been applied, such as multi-task unsupervised learning, multi-task active learning and multi-task reinforcement learning [14]. In the context of multi-task supervised learning, each task involves a supervised learning scenario where models map data instances to corresponding labels. Noteworthy MTL models corresponding to each setting are discussed in [14].

Two key factors that have a significant impact on MTL are the relationship between tasks and how tasks are defined. Understanding the relationships between different tasks and shaping the design of MTL models accordingly is crucial for task-relatedness. Tasks can be classified into different categories, such as supervised tasks like classification and regression and unsupervised tasks like clustering [14]. MTL has been applied in various fields, such as computer vision [47], bioinformatics [48], drug discovery [49], health informatics [50], speech recognition [51], natural language processing [52], and web applications [53], leading to improved application performance.

In the case of DL, MTL is commonly implemented using a shared feature extractor and several task-specific layers. The shared feature extractor processes the input data, while the task-specific layers generate predictions for each task [11].

Fig. 2 Two methods for MTL in deep neural networks. a Approach with hard parameter sharing. b Approach with soft parameter sharing [54]

Current approaches to MTL in DL are commonly categorized into two distinct groups: hard parameter sharing and soft parameter sharing. Hard parameter sharing is the process of sharing model weights across several tasks so that each shared weight is trained to jointly minimize multiple loss functions, as shown in Fig. 2a [54]. This approach reduces the risk of overfitting by compelling the model to capture a representation that fits all tasks simultaneously. In soft parameter sharing, each task is associated with its own model with distinct weights, as illustrated in Fig. 2b. However, the joint objective function includes the distance between the model parameters of different tasks. These architectural decisions align with MTL’s mechanisms, emphasizing the importance of simultaneous learning, preventing overfitting, and optimizing representations for multiple tasks [13].
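As a rough illustration of the soft parameter sharing objective described above, the following sketch adds a penalty on the distance between per-task weight vectors to the sum of task losses. The squared Euclidean distance, the pairwise form of the penalty, the coefficient `lam`, and the function names are assumptions chosen for illustration; the cited works may use other distance measures.

```python
def squared_distance(w_a, w_b):
    """Squared Euclidean distance between two weight vectors."""
    return sum((a - b) ** 2 for a, b in zip(w_a, w_b))

def soft_sharing_objective(task_losses, task_weights, lam):
    """Sum of per-task losses plus lam times the pairwise weight distances."""
    penalty = 0.0
    for i in range(len(task_weights)):
        for j in range(i + 1, len(task_weights)):
            penalty += squared_distance(task_weights[i], task_weights[j])
    return sum(task_losses) + lam * penalty

# Two tasks with their own weights; a larger lam pulls the weights together.
losses = [0.5, 0.7]
weights = [[1.0, 2.0], [1.0, 0.0]]
objective = soft_sharing_objective(losses, weights, lam=0.1)
```

When all task weight vectors are identical the penalty vanishes, so increasing `lam` moves this formulation toward the behavior of a single shared set of weights.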

The variety of MTL’s mechanisms further underlines its effectiveness. Implicit data augmentation is achieved by learning multiple tasks concurrently, which effectively expands the sample size and helps avoid overfitting. Attention focusing enables the model to focus on relevant features, which is crucial in scenarios with noisy or limited data [40]. Eavesdropping facilitates the learning of complex features by learning them through other tasks. Representation bias introduces a preference for representations that are favored by multiple tasks, which enhances generalization [55]. MTL also acts as a regularizer, reducing the risk of overfitting and the model’s sensitivity to random noise, thereby enhancing its overall generalization ability. Together with the architectural choices made for parameter sharing, these mechanisms provide a complete picture of how MTL is used in deep neural networks [13].

The standard expression for a conventional MTL algorithm [12, 40, 56, 57] is presented in the following equation:

$$\begin{aligned} W=\underset{\left[ w^1, w^2, \ldots , w^M\right] }{\arg \min } \sum _{m=1}^M L\left( X^m, y^m, w^m\right) +\lambda {\text {Reg}}(W) \end{aligned}$$
(1)

The input matrix \(X^m \in \mathbb {R}^{N_m \times D}\) represents the m-th task, whereas the output vector \(y^m \in \mathbb {R}^{N_m \times 1}\) contains the corresponding targets for the m-th task. The weight vector \(w^m \in \mathbb {R}^{D \times 1}\) denotes the regression parameters for the m-th task, which are used to map \(X^m \rightarrow y^m\), and the parameter matrix \(W\) collects the weight vectors of all tasks. The variables \(N_m\), D, and M represent the number of samples in the m-th task, the number of features, and the number of tasks, respectively. The regularizer, denoted Reg(W), is used to include prior knowledge of the data and various hypotheses about the interaction between tasks in order to create different constraints on the parameter matrix W. The regularization parameter \(\lambda\) controls the balance between the loss function and the regularizer. If \(\lambda\) is set to zero, the resulting solution does not include any assumptions or prior knowledge about the relatedness of tasks, and it may provide adequate results only on the training set. When \(\lambda\) is set too large, a generic solution that meets the task-relatedness assumption could be produced, but it might not work effectively for every prediction task. The regularization parameter and other hyper-parameters are often determined via inner cross-validation on the training samples [58]. Overall, Eq. 1 has two terms, namely the data fidelity term and the regularization term [12, 59].
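To make the role of \(\lambda\) in Eq. 1 concrete, the sketch below instantiates the objective with two assumptions that are not taken from the source: a squared-error loss \(L = \tfrac{1}{2}\Vert X^m w^m - y^m\Vert ^2\) and a mean-regularizer \({\text {Reg}}(W)=\sum _m \Vert w^m-\bar{w}\Vert ^2\), where \(\bar{w}\) is the average of the task weight vectors. The toy data, step size, and function name are likewise hypothetical. Plain gradient descent is used for simplicity.

```python
def fit_mtl(tasks, lam, lr=0.01, steps=5000):
    """Gradient descent on sum_m 0.5*||X^m w^m - y^m||^2 + lam * sum_m ||w^m - w_bar||^2."""
    M = len(tasks)
    D = len(tasks[0][0][0])
    W = [[0.0] * D for _ in range(M)]
    for _ in range(steps):
        # Mean weight vector across tasks (recomputed at every step).
        w_bar = [sum(W[m][d] for m in range(M)) / M for d in range(D)]
        new_W = []
        for m, (X, y) in enumerate(tasks):
            # Residuals X^m w^m - y^m for the current task.
            r = [sum(xi * wi for xi, wi in zip(row, W[m])) - yi
                 for row, yi in zip(X, y)]
            # Data-fidelity gradient plus regularizer gradient 2*lam*(w^m - w_bar).
            grad = [sum(r[i] * X[i][d] for i in range(len(X)))
                    + 2.0 * lam * (W[m][d] - w_bar[d]) for d in range(D)]
            new_W.append([W[m][d] - lr * grad[d] for d in range(D)])
        W = new_W
    return W

# Two related toy regression tasks sharing the same inputs (assumed data).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
tasks = [(X, [1.0, 2.0, 3.0]),   # generated by w = [1.0, 2.0]
         (X, [1.5, 2.5, 4.0])]   # generated by w = [1.5, 2.5]

W_independent = fit_mtl(tasks, lam=0.0)   # no task-relatedness assumption
W_coupled = fit_mtl(tasks, lam=10.0)      # task weights pulled toward their mean
```

With `lam=0.0` each task recovers its own least-squares solution, while a large `lam` shrinks the gap between the two weight vectors, mirroring the trade-off between the data fidelity term and the regularization term described above.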

2.3 Related work

Several review papers have contributed to a deeper understanding of the application of ML and DL in the field of cybersecurity, covering various aspects and sub-domains. The reviews summarized in Table 2 give a brief picture of how ML and DL techniques are applied in different domains of cybersecurity and offer valuable insights for researchers, experts, and individuals interested in exploring this ever-changing field. Notably, our extensive review revealed a gap in the existing literature. Although there has been extensive research on ML techniques in the field of cybersecurity, we did not come across any systematic literature review paper that specifically focuses on multi-task learning in this area.

Table 2 An overview of review studies

Our review revealed numerous notable papers that explore the use of ML techniques in cybersecurity. However, it is clear that a comprehensive and focused systematic literature review (SLR) dedicated to the application of MTL in this specific domain is lacking.

3 Research methodology

In this paper, we systematically reviewed the use of MTL in cybersecurity applications by adopting a methodology based on Kitchenham et al. [64], which is widely recognized for its effectiveness in software engineering research.

3.1 Review protocol

Establishing a defined review protocol was the first phase in our research approach. This step included the formulation of specific research questions that would serve as an outline for our study. The purpose of our research questions was to gain an understanding of the practical applications, advantages, and challenges of MTL in the field of cybersecurity.

3.2 Data sources and search strategy

A search strategy is a methodical process for locating relevant resources, which involves choosing databases, keywords, and search strings. It guarantees thorough inclusion and reduces bias. In order to collect related research papers, we ran an extensive search across multiple academic databases, which included:

  • Google Scholar

  • Web of Science

  • Scopus

  • Science Direct

  • IEEE Xplore

  • ACM Digital Library

  • Wiley

The selection of these databases was based on their extensive coverage of academic publications and their direct relevance to the domains of cybersecurity and machine learning.

We developed search strings suited to our research questions to ensure comprehensive coverage of relevant studies. The search strings included terms like "Multi-Task Learning," "cybersecurity," "machine learning," "deep learning," "cyber threats," and "cyber attacks." Boolean operators and wildcards were used to refine the searches, ensuring a broad yet focused retrieval of relevant articles. The search process involved multiple iterations to optimize the search strategy, reducing the likelihood of missing important studies and minimizing irrelevant ones.
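The way such a Boolean search string can be assembled is sketched below. The exact strings used in the review are not reproduced here, so the grouping of the terms into concept groups (OR within a group, AND across groups) and the helper name `build_query` are assumptions for illustration; only the terms themselves come from the list above. Wildcards and database-specific syntax are omitted.

```python
def build_query(concept_groups):
    """Join terms with OR inside each group and AND across groups."""
    def fmt(term):
        # Quote multi-word or hyphenated phrases so they are matched as a unit.
        return f'"{term}"' if (" " in term or "-" in term) else term

    return " AND ".join(
        "(" + " OR ".join(fmt(t) for t in group) + ")"
        for group in concept_groups
    )

# Hypothetical grouping of the search terms listed in the text.
query = build_query([
    ["multi-task learning"],
    ["cybersecurity", "cyber threats", "cyber attacks"],
    ["machine learning", "deep learning"],
])
print(query)
```

Each database has its own operator and wildcard syntax, so a string built this way would typically need to be adapted per source.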

3.3 Inclusion and exclusion criteria

To manage the large amount of literature, we established specific inclusion and exclusion criteria, as detailed in Table 3. These criteria are predefined rules that determine which studies are suitable for inclusion in the review and which are not. They consider factors such as publication date, study design, study language, outcomes, and their relevance to our research questions. This approach ensures that only the most relevant and high-quality studies are selected for our review.

Table 3 The inclusion and exclusion criteria

3.4 Data extraction and synthesis

After selecting the relevant studies, we systematically gathered essential data, including authorship, publication year, type of study, and the information needed to answer our research questions. We then proceeded with data synthesis, which involves integrating and analyzing the data from the selected studies. This process may include statistical techniques or other methods to draw meaningful conclusions, identify patterns, or explore variations in the evidence. The detailed explanation of the data extraction and synthesis process, along with the analysis, is provided in the Results section.

3.5 Review process

After implementing the aforementioned selection criteria, a total of 85 publications from various data sources were retained for further evaluation. Following the removal of duplicate records, we evaluated the titles and abstracts of the remaining 73 publications. The resulting 35 articles were then carefully examined through a study of their full texts. A manual search using backward and forward snowballing led to the discovery of additional articles. Finally, as illustrated in Fig. 3, a total of 28 papers were evaluated for their quality and included in the study.

Fig. 3 The PRISMA flow diagram [23]

3.6 Quality assessment

As the last step of the research methodology, a quality assessment was performed, and the quality criteria specified in Table 4 were applied to the selected papers. A paper that fully answers a quality question receives a score of 1. A score of 0.5 indicates a response that only partially meets the specified criterion. A score of 0 signifies that the paper does not fulfill the quality criterion.

Table 4 Quality assessment criteria

The quality evaluation scores of the selected papers are illustrated in Fig. 4. The x-axis of the graph shows the quality scores of the papers, while the y-axis indicates the number of papers at each score. The threshold for inclusion of a paper was set to 4. Most papers scored above 6 points, indicating high quality. No papers scoring below 4 were included in the final analysis.
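The scoring rule can be expressed as a short sketch. The per-criterion values (1, 0.5, 0) and the inclusion threshold of 4 come from the text; the number of criteria in the example (eight) and the function names are assumptions, since Table 4 is not reproduced here.

```python
ALLOWED_SCORES = {0.0, 0.5, 1.0}   # full, partial, or no fulfilment of a criterion
INCLUSION_THRESHOLD = 4.0          # papers scoring below this are excluded

def quality_score(answers):
    """Total quality score of a paper: the sum of its per-criterion scores."""
    if any(a not in ALLOWED_SCORES for a in answers):
        raise ValueError("each criterion must be scored 0, 0.5, or 1")
    return sum(answers)

def is_included(answers, threshold=INCLUSION_THRESHOLD):
    """A paper is kept when its total score is not below the threshold."""
    return quality_score(answers) >= threshold

# Hypothetical paper assessed against eight criteria (the count is assumed).
example_paper = [1, 1, 0.5, 0.5, 1, 0, 1, 1]
score = quality_score(example_paper)
included = is_included(example_paper)
```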

Fig. 4 The quality score of selected papers

4 Results

In this section, we present an analysis of the findings derived in response to the research questions formulated at the beginning of this study. Before presenting the answers, additional details on the identified articles are provided, including their annual distribution and publication distribution. According to the data shown in Fig. 5, there is an overall upward trend in the number of papers published each year. Although our search covered 2017 to early 2023, the highest volume of papers occurs in 2020, and the number begins to rise again in 2022. This trend suggests a growing interest among cybersecurity experts in the application of multi-task learning for cybercrime detection.

Fig. 5 Number of articles from 2017 to 2023

4.1 Selected primary studies

This section presents the primary research papers that have been used to answer our 10 research questions. Table 5 presents a compilation of primary articles that have been carefully chosen, along with their respective publication years. We included this table to enhance the repeatability and transparency of our research, providing an overview of the selected primary papers. It is important to note that papers published after our search period are not analyzed in this research paper.

Table 5 The selected primary papers

4.2 Response to research questions

4.2.1 RQ1: What are the potential applications of multi-task learning in the cybersecurity domain?

The selected articles reveal a broad range of potential applications for multi-task learning (Fig. 6). In network intrusion detection [66, 68, 71, 74, 79, 80, 84, 86, 88], the majority of approaches consider the classification of network traffic as well as the detection of malicious traffic. Similarly, the application areas for malware detection [69, 70, 73, 75, 78] include Internet of Things malware detection as well as malware classification. Cyber threat detection [31,32,33] extends to the identification of risks on social media platforms and the detection of tweets related to black markets. Additionally, studies on cyberbullying detection [72, 77] focus primarily on the identification of hate speech. Critical infrastructure protection [67, 72, 81,82,83, 85, 87, 89] shows considerable diversity. It includes the detection of cyberattacks in smart grids, the defense of critical infrastructure, the protection of Internet of Things-based systems, the identification of electricity fraud in advanced metering infrastructure (AMI), speaker verification, and the detection of spoofing attempts. All of these applications contribute to strengthening critical infrastructure in different ways.

Fig. 6 Multi-task learning applications in cybersecurity

4.2.2 RQ2: What advantages does MTL offer for cybersecurity?

MTL is a highly adaptable and potent method in the field of cybersecurity, offering a range of advantages that significantly strengthen security systems. Our analysis of the selected studies shows that MTL facilitates the simultaneous detection of multiple attack types and supports more efficient learning by using multiple layers for feature representation [66, 68, 84, 86].

One key benefit of MTL is its ability to handle diverse tasks within cybersecurity, including encrypted traffic classification and network intrusion detection, while accommodating various data types as input [69,70,71]. By integrating multiple tasks into an end-to-end training solution, MTL not only addresses different providers’ requirements but also minimizes computational overhead and redundancy through shared feature learning architectures [73, 74, 76, 79]. This holistic approach streamlines problem resolution, optimizes resource utilization, and enhances feature learning capabilities [67, 81].

Moreover, the adaptability of MTL is a significant asset in addressing evolving cybersecurity challenges. By automating alarm region determination and quickly adapting to new types of malware without complete retraining, MTL ensures practical applicability to real-world security scenarios [67, 79, 80]. This dynamic adaptability makes MTL an invaluable tool in the arsenal of cybersecurity professionals, offering resilience and flexibility in the face of rapidly changing threat landscapes.

In basic terms, MTL in cybersecurity improves the efficiency and effectiveness of security systems, providing practical benefits like adaptability, efficiency, and robustness. Through the utilization of shared representations and multiple layers for feature learning, MTL has proven to be a highly adaptable and essential approach in tackling the complex obstacles of cybersecurity.
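The shared-representation idea described above can be illustrated with a minimal sketch of hard parameter sharing: one shared layer feeds two task-specific heads, and the training objective is a weighted sum of the per-task losses. The two tasks (attack detection and traffic-type classification), the layer sizes, the fixed weights, and the loss weights are hypothetical choices made for illustration; they are not taken from any of the reviewed papers, and no training step is shown.

```python
import math

def relu(values):
    return [max(0.0, v) for v in values]

def linear(weights, bias, x):
    # weights holds one row per output unit
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    return -math.log(probs[label])

# Shared layer (hypothetical weights): 3 input features -> 2 hidden units
SHARED_W = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
SHARED_B = [0.0, 0.1]
# Head A (hypothetical): binary attack detection, 2 classes
HEAD_A_W = [[1.0, -1.0], [-1.0, 1.0]]
HEAD_A_B = [0.0, 0.0]
# Head B (hypothetical): traffic-type classification, 3 classes
HEAD_B_W = [[0.2, 0.4], [-0.3, 0.9], [0.5, -0.6]]
HEAD_B_B = [0.0, 0.0, 0.0]

def forward(x):
    # Both heads read the same hidden representation
    hidden = relu(linear(SHARED_W, SHARED_B, x))
    probs_a = softmax(linear(HEAD_A_W, HEAD_A_B, hidden))
    probs_b = softmax(linear(HEAD_B_W, HEAD_B_B, hidden))
    return probs_a, probs_b

def joint_loss(x, label_a, label_b, weight_a=1.0, weight_b=0.5):
    # Weighted sum of the per-task cross-entropy losses
    probs_a, probs_b = forward(x)
    return (weight_a * cross_entropy(probs_a, label_a)
            + weight_b * cross_entropy(probs_b, label_b))
```

Because both heads read the same hidden vector, a gradient step on the joint loss would update the shared layer using signals from both tasks, which is the mechanism behind the reduced redundancy discussed above.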

4.2.3 RQ3: What type of tasks are used?

Various tasks are employed in the selected primary papers to enhance security. A promising way to support cybersecurity efforts is through multi-tasking, which makes it possible to address multiple security-related issues at once. Table 6 summarizes these tasks, which can be applied to improve threat detection, allocate resources optimally, enhance model accuracy, manage complicated data, and promote cooperation in the cybersecurity domain. The tasks cover a wide range of domains within cybersecurity, each targeting significant aspects of detecting and protecting against threats.

Network intrusion detection, malware detection, and VPN encapsulation detection seek to identify, classify, and analyze potentially malicious activities or anomalies within network traffic. Tasks like Trojan detection and classification aim to spot and categorize specific types of malicious software. Other tasks, such as traffic type recognition, software application classification, and bandwidth prediction, explore network behaviors and predict network patterns.

Cyber threat detection tasks like text matching, hashtag hijack identification, and cyber intelligence recognition work toward identifying and analyzing threats within textual and social media data. Cyberbullying detection tasks focus on identifying hate speech, offensive language, racism, and sexism within online interactions. Critical infrastructure protection tasks include graph classification, feature extraction, and image and video analysis to protect critical systems from various threats, including fraud, spoofing, and large-scale data flow classification and analysis. In addition, tasks associated with automatic speaker verification and anti-spoofing address concerns regarding voice authentication and protecting against spoofing attempts. Each of these tasks reflects a crucial domain within cybersecurity, contributing to the overall objective of identifying and safeguarding against threats to the infrastructure.

Table 6 Identified tasks

4.2.4 RQ4: What type of machine learning techniques are used in MTL?

MTL is a widely used ML approach for cybersecurity tasks. It can be built based on the following ML types:

  • Supervised learning: Employs labeled datasets to train models for various tasks.

  • Unsupervised learning: Utilizes techniques such as clustering or anomaly detection, where patterns are identified without the use of labeled data.

  • Reinforcement learning: Uses agents that make sequential decisions in an environment and maximize a cumulative reward through trial and error.

  • Semi-supervised learning: Optimizes model performance using a mix of labeled and unlabeled examples across different tasks.

These approaches can help enhance predictive accuracy, uncover patterns and relationships, and adapt to evolving cyber threats.

Fig. 7 Machine learning techniques in multi-task learning for cybersecurity

Figure 7 highlights the prevalent application of supervised learning methodologies in multi-task learning across the cybersecurity field. However, there is a noticeable lack in the use of reinforcement learning, unsupervised learning, and semi-supervised learning methods within this field.

4.2.5 RQ5: What are the most frequently used machine learning and deep learning algorithms for MTL?

In our study, we identified several ML and DL algorithms that are frequently employed to address various security challenges. These algorithms play a crucial role in enhancing the functionality of cybersecurity systems. We categorized these algorithms into ML and DL groups. Figure 8 illustrates the number of research papers that apply ML and DL algorithms to each cybersecurity problem.

The ML algorithms include Sequential Minimal Optimization (SMO), Support Vector Machines (SVM), K-Means, Neural Networks (NN), Random Forests (RF), Simple Logistic Regression (SLR), Decision Trees (DT), k-Nearest Neighbors (k-NN), Fuzzy Logic, Naive Bayes (NB), AdaBoost, and Logistic Regression (LR). On the other hand, the DL algorithms include Bidirectional Long Short-Term Memory Networks (BiLSTM), Bidirectional Encoder Representations from Transformers (BERT), Deep Reinforcement Learning (DRL), Autoencoders (AE), Deep Neural Networks (DNN), Convolutional Neural Networks (CNN), Generative Adversarial Networks (GAN), Long Short-Term Memory (LSTM), Recurrent Neural Networks (RNN), Bidirectional Gated Recurrent Unit (BiGRU), Graph Neural Networks (GNN), Convolutional Graph Neural Networks (CGNN), Graph Autoencoders (GAE), Feedforward Neural Networks (NewFF), and Multilayer Perceptrons (MLP).

The main security challenges:

P1: Network intrusion problems.

P2: Malware problems.

P3: Cyber threat problems.

P4: Cyberbullying problems.

P5: Critical infrastructure problems.

As illustrated in Fig. 8a, Support Vector Machine (SVM) is the most frequently utilized ML algorithm for problems P1, P2, P4, and P5, whereas Random Forest is the predominant algorithm for problem P3. In Fig. 8b, Convolutional Neural Network (CNN) is the most commonly employed DL algorithm for problems P1 and P5. For problem P2, CNN, Deep Neural Networks (DNN), and Autoencoder (AE) are the primary algorithms, each applied once. For problem P3, Multi-Layer Perceptron (MLP) is the prevalent DL algorithm. It is worth distinguishing between MLP and DNN: MLP refers to a specific type of neural network with multiple layers, while DNN is a broader term covering neural networks with a considerable number of layers and various architectures. For problem P4, BERT (Bidirectional Encoder Representations from Transformers) is used in the majority of studies.

Fig. 8 Most frequently used machine learning and deep learning algorithms for multi-task learning

4.2.6 RQ6: Which model provides the best performance for MTL?

Each primary study presents its own model. Table 7 outlines the various models utilized in multi-task learning within cybersecurity, detailing their specific advantages in building security systems. While most research papers introduce new models with dedicated names, some are simply named multi-task learning (MTL) models.

Table 7 An overview of MTL models

As shown in Table 7, the majority of multi-task models outperform their single-task counterparts, indicating their efficacy in reducing complexity and handling multiple tasks efficiently.

4.2.7 RQ7: What datasets are used for evaluating MTL models?

In terms of cybersecurity-related challenges, we identified several public datasets and sources in the papers we reviewed. Table 8 presents an overview of these datasets. Following that, we provide a summary of their main characteristics.

Table 8 The datasets overview

Regarding network intrusion detection, a wide range of datasets have been utilized: UNSW-NB15 [66, 71, 80], a hybrid dataset merging benign network traffic with simulated cyberattacks across nine different types; CICIDS2017 [66, 69, 76, 80, 84], which contains 14 attack categories such as denial-of-service (DoS), distributed denial-of-service (DDoS), web, and botnet attacks; and Bot-IoT [71], which combines actual and simulated Internet of Things (IoT) network activity and includes intrusion attempts such as DDoS attacks and information theft. Additionally, the ISCX2012 [71] and ISCX VPN-nonVPN [79, 84] datasets present scenarios of network infiltration and encrypted traffic detection, respectively.

For malware detection, datasets such as IoT-23 and VARIoT [69] focus on IoT device traffic, while StratosphereIPS and MTA provide diverse mixes of malware and traffic. The Microsoft Malware Classification Challenge dataset [78] provides a diverse collection of malware samples organized into distinct groups and presented as disassembled and binary data files, which increases the complexity of the detection tasks.

In cyber threat detection, a variety of datasets have been developed: Study [31] constructed a dataset by aggregating microblogs from Weibo, China’s major microblog platform, amassing 11,508 relevant microblogs through API searches and random selection, while study [33] gathered 31,281 tweets. Study [32] collected 2,690 tweets from black market sites and 2,000 genuine tweets from users’ timelines, focusing on non-English tweets and those with adequate length.

In cyberbullying detection, studies [72, 77] used datasets such as ST-Bully, BullySent, and collections obtained from Hatebase.org, Twitter API, and diverse social media platforms. The datasets were selected to include a wide range of online interactions, covering everything from casual conversations to intense debates, in order to accurately capture samples of bullying, hate speech, aggression, and harassment.

Concerning critical infrastructure protection, datasets such as IEEE bus system [65], railway images [81], FaceForensics [83], and ASVspoof 2017 [85] were utilized. These datasets represent a variety of information, including power grid simulations, video manipulation datasets, and transactional communication data. They have played a major part in revealing previously unknown security threats and vulnerabilities, and they provide information that is essential for protecting critical infrastructure from potential risks and attacks.

4.2.8 RQ8: Which evaluation approaches and parameters are used in MTL?

The reviewed studies rely on metrics that are standard in cybersecurity analysis, including accuracy, precision, recall, F1 score, specificity, area under the curve (AUC), false positive and false negative rates, mean absolute and mean squared errors, detection rate, error rate, and task-specific loss functions. The choice of evaluation methods varied depending on the specific cybersecurity tasks and research aims, with precision, recall, and AUC emphasized for precise threat detection while minimizing false positives.
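As a minimal sketch of how several of the metrics listed above relate to one another, the following Python function derives them from the four confusion-matrix counts of a binary detector. The function name and the convention that label 1 means "attack" are hypothetical; the reviewed papers do not prescribe an implementation.

```python
def binary_metrics(y_true, y_pred):
    """Compute common detection metrics; label 1 is the positive (attack) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign correctly passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed attacks

    def ratio(numerator, denominator):
        # Return 0.0 instead of failing when a class is absent
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)  # commonly reported as the detection rate
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "specificity": ratio(tn, tn + fp),
        "false_positive_rate": ratio(fp, fp + tn),
        "false_negative_rate": ratio(fn, fn + tp),
    }
```

On imbalanced traffic, accuracy can stay high even when most attacks are missed, which is one reason precision, recall, and the false positive rate are reported alongside it.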

Our findings highlight the diverse range of evaluation approaches and parameters utilized in assessing multi-task learning models in cybersecurity, demonstrating the adaptability and resilience necessary for facing complex and evolving cyber threats. In Fig. 9, we show the frequency of each evaluation parameter used across the identified cybersecurity problems. For network intrusion detection (P1), malware detection (P2), and critical infrastructure protection (P5), accuracy was the most frequently used metric, whereas cyberbullying detection (P4) leaned toward F1 score, recall, and precision. Cyber threat detection (P3), in contrast, mainly utilized accuracy, F1 score, macro-F1, and weighted-F1 as primary evaluation metrics.
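The difference between macro-F1 and weighted-F1 mentioned above can be made concrete with a short sketch. Macro-F1 averages the per-class F1 scores equally, while weighted-F1 weights each class by its number of true samples. The function names are hypothetical, and the sketch assumes that only classes appearing in the true labels are scored.

```python
from collections import Counter

def per_class_f1(y_true, y_pred, label):
    # One-vs-rest F1 for a single class
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0

def macro_and_weighted_f1(y_true, y_pred):
    support = Counter(y_true)  # number of true samples per class
    labels = sorted(support)
    scores = {label: per_class_f1(y_true, y_pred, label) for label in labels}
    macro = sum(scores.values()) / len(labels)
    weighted = sum(scores[label] * support[label] for label in labels) / len(y_true)
    return macro, weighted
```

When a minority class (for example, hateful posts) is detected poorly, macro-F1 drops more sharply than weighted-F1, so the two values together indicate how evenly a model performs across classes.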

Fig. 9 Usage of evaluation metrics for the cybersecurity problems

Fig. 10 illustrates the distribution of validation approaches, showing a clear inclination toward cross-validation as the preferred method in the majority of papers. Nevertheless, some studies did not mention their validation approach, potentially affecting the reliability of their experimental results. This highlights the importance of providing thorough explanations in research papers.

Fig. 10 Evaluation approaches

Figures 9 and 10 illustrate that accuracy and F-measure are the most commonly used evaluation metrics and that cross-validation is the preferred validation approach.
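For readers unfamiliar with cross-validation, the following sketch shows one way to generate k-fold train/test splits using only the Python standard library. The function name, the fixed seed, and the round-robin fold assignment are illustrative assumptions; the reviewed papers do not specify how their folds were built.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the splits repeatable
    # Deal the shuffled indices into k folds, round-robin
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        test_idx = folds[held_out]
        train_idx = [idx
                     for fold_number, fold in enumerate(folds)
                     if fold_number != held_out
                     for idx in fold]
        yield train_idx, test_idx
```

Each sample appears in exactly one test fold, so averaging a metric over the k folds uses every sample for evaluation once. Reporting k and the seed is one simple way to address the missing validation details noted above.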

4.2.9 RQ9: Which implementation platforms are used to develop MTL models?

We investigated the various implementation platforms employed by research papers to develop and deploy cybersecurity solutions. The choice of implementation platform is a critical consideration, as it directly impacts the efficiency and scalability of the solutions. Fig. 11 shows the implementation platforms used in the selected papers.

Fig. 11 Distribution of selected implementation platforms

A number of scholarly articles have emphasized the use of open-source ML and DL frameworks, such as TensorFlow (https://www.tensorflow.org/), PyTorch (https://pytorch.org/), and Keras (https://keras.io/), for the implementation of multi-task learning models. These frameworks offer an extensive array of tools, libraries, and pre-trained models, thereby expediting the development process of applications.

In the selected primary research papers, unfortunately, 48% of the studies did not explicitly mention the framework used, while 24% employed Keras, 14% utilized PyTorch, and another 14% employed TensorFlow. The choice of the implementation platform often depends on the specific objectives and requirements of the cybersecurity task at hand. Researchers tend to select platforms that align with factors such as data volume, real-time processing demands, and available resources. Furthermore, considerations pertaining to data privacy, regulatory compliance, and the necessity for robust security measures significantly affect the platform selection process. The widespread adoption of Keras in the selected MTL-based papers can be attributed to its high-level, user-friendly Application Programming Interface (API), modularity allowing compatibility with various DL frameworks like TensorFlow, and its provision of a user-friendly experimental environment, which makes it a popular choice compared to other platforms.

4.2.10 RQ10: What are the challenges and possible solutions for multi-task learning in cybersecurity?

To effectively address the challenges of MTL in the cybersecurity field, we conducted a systematic analysis and synthesis of the challenges identified in the selected papers. This synthesis enables us to extract the main difficulties and their corresponding solutions, offering a systematic framework for understanding and dealing with the complexity of applying MTL in cybersecurity. To answer this research question, we provide Table 9, which summarizes the challenges along with their corresponding resolutions across the chosen papers.

Table 9 Synthesized challenges and solutions for MTL in cybersecurity

4.3 Threats to validity

This systematic literature review aimed to comprehensively examine the existing research on the use of multi-task learning in the field of cybersecurity. The primary objective of this study was to respond to 10 research questions posed at the beginning of this research, determine the obstacles encountered, and evaluate the effectiveness of these ideas using established measures. The findings were aggregated in a manner that successfully addressed the research questions. The identification of difficulties has the potential to facilitate future research work and enhance the development of more effective solutions. Including the datasets used, as well as the ML classifier utilized and its corresponding accuracy, would provide a clear path for future research.

It is possible that some articles were not included in the search results owing to the absence of similar terms being used. Certain observations may have been altered due to variations in the use of terminology across various publications. The involvement of all authors in the article selection process and the subsequent formation of a consensus helped mitigate any bias. In order to mitigate the risks of conclusion validity, the authors of this research drew their findings from conversations during several meetings. Consequently, the individual basis upon which the data were interpreted was minimized.

5 Conclusion and future work

Advances in multi-task learning algorithms have enabled the creation of efficient detection models. Despite the large number of models that have been built so far, there are still certain problems that lack clear answers. We conducted a systematic literature review to respond to 10 research questions, investigated 28 high-quality articles, and assessed the applications and implementations of multi-task learning algorithms. We discussed the difficulties and potential solutions in cybercrime detection algorithms based on multi-task learning. We identified the most used multi-task learning algorithms, frequently used datasets, ML categories, development platforms, assessment measures, validation strategies, and data sources. Moreover, research gaps and challenges were provided.

The results of our SLR have significant implications for various fields. In the military sector, it is of utmost importance to guarantee the protection of sensitive data in order to maintain national security. Reliable cybersecurity solutions play an essential role in the financial services industry to protect financial data. Data protection is essential in political systems to mitigate the risk of cyberattacks. This SLR emphasizes many emerging trends in cybersecurity that are applicable to these disciplines.

Our study has also identified several areas for future research and development. These include creating semi-supervised and unsupervised models for diverse detection systems, advancing hybrid multi-task learning architectures, establishing a robust framework for comparative analysis, and addressing specific challenges in identifying and mitigating diverse cyber threats. We emphasize the necessity of larger datasets that are accessible to the public for thorough evaluations and experiments. Furthermore, we recommend that future research should prioritize the creation of innovative detection models based on multi-task learning (MTL) to improve the identification and mitigation of a wide range of cyber threats in different areas of cybersecurity.

Additionally, further research may explore the potential effects of emerging technologies, such as Generative Artificial Intelligence (Gen AI) and Large Language Models (LLMs), on the use of machine learning and deep learning techniques in the field of cybersecurity. Incorporating recently published articles could further enhance the scope of this systematic literature review. An analysis of real-world cybersecurity systems, including case studies, success stories, and lessons learned, could also provide valuable insights into the potential impact of multi-task learning in the future.