3 AI To Improve Intrusion Detection
Christina Ampatzi
Attacks on computer systems have been a persistent problem since computers were
first created. Despite the use of various security measures, security threats have increased as
technology has advanced. With the world relying heavily on computers, it is crucial to protect them
from malicious activities that could impair their functioning. To secure computer systems, especially
within networks, Intrusion Detection Systems (IDS) and Intrusion Prevention Systems (IPS) are
commonly used. These systems are essential for keeping computer environments safe
and secure, especially as security challenges grow. Incorporating
Artificial Intelligence (AI) into IDS and IPS technologies can make them more effective at
detecting and preventing intrusions, which strengthens the overall security of computer systems.
The goal of this research is to explore how AI methods can enhance IDS and IPS.
Table of Contents
1. Introduction
1.1 Overview of the Field
1.2 Motivation
1.3 Thesis Aim
1.4 Scope
1.5 Limitations
1.6 Thesis Outline
2. Background
2.1 Intrusion Detection System
2.1.1 Network-Based Intrusion Detection System
2.1.2 Host-Based Intrusion Detection System
2.2 Intrusion prevention system
2.3 Types of intrusion detection
2.3.1 Signature-Based Intrusion Detection
2.3.2 Anomaly-Based Intrusion Detection
3. Research Methodology
3.1 Choice of research method
3.1.1 Systematic literature review
3.1.2 Semi-systematic literature review
3.1.3 Integrative literature review
3.2 Literature review processes
3.3 Define concept and scope
3.4 Search for relevant literature
3.5 Literature search
3.6 Analysis and synthesis
3.7 Review and combination of literature
4. Datasets
4.1 KDDCup99
4.2 CICIDS2017
4.3 NSL-KDD
5. Attacks
6. Metrics
7. Machine learning
7.1 Machine Learning Classification of Problems
7.2 How ML works
7.3 ML Learning Methods
7.3.1 Supervised learning
7.3.2 Unsupervised learning
8. Supervised Learning
8.1 Decision Tree
8.2 Random Forest
8.3 Support Vector Machines
8.4 Naïve Bayes
9. Unsupervised Learning
9.1 Clustering
9.2 Principal Components Analysis
9.3 Isolation-Forest
9.4 One-Class Support Vector Machine
9.5 Auto-Encoder
10. Discussion
11. Future Work
12. Conclusion
References
1. Introduction
1.1 Overview of the Field
Attackers constantly adapt and innovate their tactics, making it essential for organizations and
individuals to continually enhance their cybersecurity measures to defend against these evolving
threats. These threats include various types of attacks such as malware, phishing, ransomware,
DDoS (Distributed Denial of Service) attacks, and social engineering. Organizations, both public
and private, store and transmit sensitive information on a massive scale. It is crucial for
organizations to protect data from attacks by ensuring the confidentiality, integrity, and availability
of the information. Intrusion Detection Systems (IDS) and Intrusion Prevention Systems (IPS) are
cybersecurity tools aimed at safeguarding computer networks from potential threats. IDS monitors
network or system activities, analyzing data for patterns indicative of known threats and alerting
administrators to potential issues. On the other hand, IPS, an advanced iteration of IDS, not only
identifies suspicious activities but also takes proactive measures to prevent or block them. Unlike
IDS, IPS can automatically respond to threats by blocking network traffic or implementing other
preventive actions. The distinction lies in their response mechanisms, with IDS providing alerts
for human intervention and IPS actively intervening in real-time to mitigate potential risks.
Together, IDS and IPS contribute to a robust cybersecurity strategy by detecting and, in the case
of IPS, actively preventing unauthorized access and malicious activities on networks and systems
[1].
1.2 Motivation
Intrusion Prevention Systems and Intrusion Detection Systems play a critical role in safeguarding
networks against a large number of cyber threats. As the complexity and frequency of cyber-
attacks continue to rise, there is an escalating demand for advanced techniques to enhance the
efficacy of these systems. Traditional IPS and IDS solutions often struggle to keep pace with the
evolving landscape of cyber threats, primarily due to their reliance on static anomaly-based
approaches and signature-based detection mechanisms.
The emergence of Artificial Intelligence presents a promising avenue for revolutionizing IPS and
IDS. By harnessing the power of AI, particularly supervised and unsupervised learning methods,
these systems can evolve beyond their static nature to become more adaptive, intelligent, and
proactive in threat detection and prevention [5] [6]. Cyber threats are becoming increasingly
sophisticated, employing stealthy techniques to evade traditional detection methods. Signature-
based approaches are often ineffective against zero-day attacks and polymorphic malware. AI-
driven IPS and IDS can adapt to the dynamic threat landscape by learning from past attacks and
identifying anomalous patterns that may signify potential threats.
Furthermore, supervised learning algorithms, such as Support Vector Machines (SVM) and Neural
Networks, offer superior detection accuracy compared to anomaly-based systems. By training on
labeled datasets containing examples of both normal and malicious network traffic, these
algorithms can generalize patterns and effectively distinguish between benign and malicious
activities [5]. Unsupervised learning methods, such as clustering and anomaly detection,
complement supervised techniques by identifying novel threats without the need for labeled data
[6].
Additionally, one of the challenges faced by traditional IPS and IDS solutions is the generation of
false positives, which can inundate security teams with irrelevant alerts and divert their attention
from genuine threats. AI-powered systems can mitigate this issue by refining their detection
capabilities over time, thereby reducing false alarms, and enabling security personnel to focus on
critical incidents [7].
Unlike static rule-based systems that require manual updates to accommodate new threats, AI-
driven IPS and IDS can autonomously learn and adapt to emerging attack techniques. Supervised
learning models can be continuously trained on new data to stay abreast of evolving threats, while
unsupervised learning techniques excel at identifying previously unseen patterns indicative of
suspicious behavior [5] [6].
Finally, AI algorithms can process vast amounts of network data in real-time, making them well-
suited for high-speed and large-scale environments. By automating the analysis of network traffic
and prioritizing alerts based on their severity, AI-powered IPS and IDS can improve the efficiency
of security operations and enable rapid response to potential breaches [7].
1.3 Thesis Aim
The aim of this thesis is to explore the impact of artificial intelligence on the design and operation
of intrusion detection mechanisms, specifically Intrusion Prevention Systems and Intrusion
Detection Systems. In recent years, advancements in AI have revolutionized the field of
cybersecurity, offering new approaches and techniques for detecting and mitigating security
threats. Through an extensive review of existing literature and research articles from databases
such as IEEE Xplore, the thesis aims to provide insights into how AI has influenced the
development and functionality of IPS and IDS. By analyzing a wide range of studies, the thesis
seeks to identify key AI methodologies, algorithms, and frameworks that have been applied in IPS
and IDS. Furthermore, it aims to explore the strengths and limitations of these AI-based
approaches in effectively detecting and responding to various types of cyber threats.
Another important aspect of this thesis is to delve into the design principles and operational
mechanisms of AI-enabled IPS and IDS. Through a critical analysis of existing literature, the
research aims to uncover the architectural components, feature extraction techniques, and decision-
making processes employed in AI-driven intrusion detection systems. Additionally, it seeks to
evaluate the performance metrics and effectiveness of these systems in real-world scenarios. By
synthesizing the findings from diverse sources, this thesis aims to contribute to the body of
knowledge surrounding AI-based intrusion detection. It seeks to provide valuable insights for
cybersecurity professionals, researchers, and policymakers on the evolution of IPS and IDS in the
era of AI.
In summary, this thesis aims to provide a comprehensive overview of how AI technologies have
influenced the operation of IPS and IDS. Through a semi-systematic review of existing literature,
it aims to shed light on the opportunities, challenges, and implications of integrating AI into
intrusion detection mechanisms.
1.4 Scope
This thesis focuses on exploring the impact of artificial intelligence on the operation of Intrusion
Prevention Systems and Intrusion Detection Systems. The scope of the research encompasses an
extensive review of existing literature and research articles from reputable sources such as IEEE
Xplore and ResearchGate, with the aim of understanding how AI technologies have influenced
the field of intrusion detection.
Within the scope of this thesis, various AI methodologies and algorithms applied in IPS and IDS
will be examined. This includes supervised and unsupervised machine learning algorithms used
for anomaly detection, pattern recognition, and threat classification. The research will delve into
the strengths and limitations of these AI-based approaches in enhancing the effectiveness of
intrusion detection mechanisms.
1.5 Limitations
It is important to acknowledge several limitations in this research. Firstly, the scope of the thesis
is confined to a literature review, which means that no experimental studies are conducted to
validate the findings. Therefore, while the research provides valuable insights into the existing
literature on AI-driven IPS and IDS, the conclusions drawn may not fully capture the real-world
performance and effectiveness of these systems.
Furthermore, the scope is limited to AI technologies in the context of IPS and IDS, and does not
cover other aspects of cybersecurity, such as network security, endpoint protection, or data privacy.
As a result, the findings may not be generalizable to other areas within the cybersecurity domain.
Lastly, the rapid pace of technological advancements in AI and cybersecurity means that some of
the information presented in the thesis may become outdated over time. Therefore, the research
should be considered within the context of the current state of the field at the time of writing, with
the understanding that new developments may emerge in the future. Despite these limitations, the
research aims to provide a comprehensive overview of AI's influence on intrusion detection and
identify opportunities for further research and development in this area.
1.6 Thesis Outline
The structure of this thesis is organized as follows: Chapter 1 begins with an introduction to the
thesis. Chapter 2 provides background information on Intrusion Prevention Systems (IPS) and
Intrusion Detection Systems (IDS), including methods such as signature-based and anomaly-based
detection. Chapter 3 details the methodology of this research, explaining how data was treated and
processed. Chapter 4 discusses the most commonly used datasets in IPS and IDS for identifying
attacks. Chapter 5 examines the various types of attacks that can be identified by these systems.
Chapter 6 outlines the metrics researchers use to evaluate the effectiveness of different algorithms.
Chapter 7 explores the impact of machine learning on the operations of IPS and IDS. Chapters 8
and 9 delve into supervised and unsupervised learning algorithms, respectively. Chapter 10
discusses the overall impact of AI algorithms on the IPS and IDS processes. Chapter 11 suggests
potential future work to further explore this topic. Finally, Chapter 12 concludes the thesis,
summarizing the key findings and implications.
2. Background
Intrusion detection involves monitoring network or system activities for malicious actions or
policy violations. This process employs various techniques, including signature-based and
anomaly-based methods. Intrusion Detection Systems are specialized tools designed to detect
unauthorized access, data breaches, and other security threats. IDS analyze network traffic, system
logs, and other sources to identify potential security breaches. By detection method, they can be
categorized into two main types: signature-based and anomaly-based [7]. Intrusion Prevention Systems build upon the
capabilities of intrusion detection by actively blocking malicious activity. IPS utilize threat
intelligence, signature matching, and anomaly detection techniques to identify and prevent
potential threats before they cause harm [8].
2.1 Intrusion Detection System
The increasing threat landscape of cyberattacks underscores the urgent necessity for robust
security measures, leading to the development of the Intrusion Detection System. Intrusion
detection involves the continuous monitoring of computer networks and systems to detect events
that deviate from established security protocols. Introduced in the 1980s, intrusion detection
technology serves as a defensive mechanism, gathering vital data from computers for analysis to
identify any network breaches violating security policies. This system is specifically designed to
reliably detect and prevent any unauthorized activities within a network or system [9].
As the network environment grows increasingly complex, intrusion detection systems have
become widely adopted. An IDS functions by continuously monitoring and analyzing various
activities, detecting signs of malicious or unauthorized behavior. Its primary objective is to
promptly identify and respond to security incidents in real-time, mitigating potential or ongoing
attacks and implementing measures to prevent unauthorized intrusion. To achieve this, it generates
alerts notifying host or network administrators of detected malicious behavior [7].
Intrusion Detection Systems are broadly classified into two categories: network-based intrusion
detection systems (NIDS) and host-based intrusion detection systems (HIDS). Each of these
classifications is tailored to address specific types of intrusive behaviors, offering comprehensive
protection against a range of security threats [10]. The purpose of an IDS is to identify security
incidents such as unauthorized access attempts, malware, or suspicious network traffic and
promptly notify administrators. It can detect potential attacks by analyzing log files, system
activities, network packets, and other data sources. There are two detection methods that IDS
employs to recognize potential threats: Signature-based detection and Anomaly-based detection
[11].
During an intrusion scenario, the network system undergoes a structured detection process
encompassing four stages. First, in the data collection stage, initial user
information and essential network data are retrieved using external sensors or proxy hosts.
Second, the collected data is processed to standardize the various types of information into
a uniform format for computer interpretation. Third, the processed data is
analyzed through statistical evaluation and pattern matching to identify potential threats
to the system; when threats are detected, the system transmits the relevant information to the control
module for further action. Finally, in the response processing stage, the system matches the
identified threats with predefined rules and executes appropriate response measures to mitigate the
detected intrusions [12].
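The four stages described above can be illustrated with a minimal Python sketch. The record fields, the byte-count limit, and the rule table are hypothetical placeholders chosen for illustration and are not taken from [12]; a real system would read from sensors and apply far richer analysis.

```python
from dataclasses import dataclass


@dataclass
class Event:
    source: str      # hypothetical sender address
    bytes_sent: int  # hypothetical traffic feature


def collect():
    """Stage 1: data collection (hard-coded records stand in for sensors)."""
    return [
        {"src": "10.0.0.5", "bytes": "120"},
        {"src": "10.0.0.7", "bytes": "95"},
        {"src": "10.0.0.9", "bytes": "98000"},
    ]


def preprocess(raw_records):
    """Stage 2: convert raw records into one uniform format."""
    return [Event(r["src"], int(r["bytes"])) for r in raw_records]


def analyze(events, limit=10_000):
    """Stage 3: flag events; a fixed limit stands in for real analysis."""
    return [e for e in events if e.bytes_sent > limit]


# Hypothetical predefined rules mapping a threat type to a response.
RESPONSE_RULES = {"high_volume": "alert_administrator"}


def respond(threats):
    """Stage 4: match each threat to a predefined rule and pick the action."""
    return [(t.source, RESPONSE_RULES["high_volume"]) for t in threats]


actions = respond(analyze(preprocess(collect())))
print(actions)
```

Each stage only consumes the output of the previous one, which mirrors the sequential structure of the process described in the text.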
2.1.1 Network-Based Intrusion Detection System
Network Intrusion Detection Systems utilize two main detection approaches: signature-based
detection and anomaly-based detection. Signature-based detection, also referred to as knowledge-
based or misuse detection, relies on predefined rules, also known as signatures, for known attacks.
These signatures are compared with network connection patterns to identify and block any
malicious connections. While effective for detecting known attacks, this approach struggles with
identifying new attack types lacking predefined signatures. Moreover, it demands significant
resources due to the storage and comparison of numerous signatures against incoming and
outgoing traffic [13].
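As a rough illustration of signature matching and of its blind spot for unseen variants, consider the following sketch. The two signatures and the request strings are invented for this example; production systems use far larger rule sets and richer pattern languages.

```python
# Hypothetical signature database: name -> byte pattern expressed as text.
SIGNATURES = {
    "sql_injection": "' OR 1=1",
    "path_traversal": "../../",
}


def match_signatures(payload, signatures=SIGNATURES):
    """Return the names of all signatures whose pattern occurs in the payload."""
    return [name for name, pattern in signatures.items() if pattern in payload]


# A known attack string is caught ...
known = match_signatures("GET /index.php?id=1' OR 1=1 --")
# ... but a trivially modified variant has no matching signature.
variant = match_signatures("GET /index.php?id=1' OR 2=2 --")

print(known, variant)
```

Every payload is compared against every stored signature, which hints at why the cost of this approach grows with the size of the signature database.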
Anomaly-based detection focuses on detecting differences from normal network behavior. This
method creates a baseline that represents the typical behavior of network traffic against which
incoming connections are compared to detect anomalies. Unlike signature-based detection,
anomaly-based systems can identify both known and unknown attacks, making them adaptable to
evolving threats. Consequently, IDS deployments employing anomaly-based detection are better
equipped to handle varied and constantly changing attack scenarios [13].
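The idea of a baseline can be sketched with a simple statistical check. This is only one possible realization, assuming a single numeric feature (connections per minute) and a z-score threshold of 3; both are assumptions made for illustration, not a method prescribed by [13].

```python
from statistics import mean, stdev


def build_baseline(normal_values):
    """Summarize normal behavior as (mean, standard deviation)."""
    return mean(normal_values), stdev(normal_values)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value whose z-score against the baseline exceeds the threshold."""
    mu, sigma = baseline
    return abs(value - mu) / sigma > threshold


# Hypothetical connections-per-minute observed during normal operation.
normal_rates = [48, 52, 50, 47, 53, 51, 49, 50]
baseline = build_baseline(normal_rates)

# One ordinary observation and one burst that deviates from the baseline.
flags = [is_anomalous(v, baseline) for v in (51, 400)]
print(flags)
```

No attack signature is involved: the burst is flagged purely because it departs from what was learned as normal, which is why this approach can also react to previously unseen attacks.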
2.1.2 Host-Based Intrusion Detection System
The Host-Based Intrusion Detection System (HIDS) functions on individual host devices,
overseeing activities such as file changes and login attempts. Additionally, it scrutinizes pertinent
log files on the host, including those related to the kernel, system, server, and network. To enhance
its efficacy, HIDS compares its observations with records of previous attacks stored on the server.
In doing so, the system identifies and alerts on intrusions and suspicious behaviors occurring at
the specific host level [10]. Figure 2 depicts the architecture of HIDS.
Implementing host-based IDS typically involves selecting a metric present in the host and using it
as input for a decision engine. Some techniques rely on information extracted from various log
files within a computer. However, utilizing log files has its limitations. Firstly, log files provide
interpreted data, generated by daemon programs monitoring system activity, which inherently
offers a diluted data source. Secondly, as highlighted in previous research, log file production is
not seamless, often resulting in a large volume of potentially irrelevant data being treated with
equal priority to critical information. Managing and creating log files also presents mechanical
challenges, further complicating the process [14].
2.2 Intrusion prevention system
Essentially, an IPS acts as an advanced form of an Intrusion Detection System, adding an extra
layer of security by actively blocking potential threats instead of just alerting administrators. It's
an intelligent system capable of not only identifying malicious activities but also implementing
preventive measures to safeguard the network or individual hosts [16].
At its core, an IPS employs sophisticated detection mechanisms to identify potential threats. One
method is signature-based detection, where the system compares network traffic against a database
of known attack patterns or signatures. These signatures represent specific types of attacks, such
as viruses, worms, or denial-of-service (DoS) attacks. When a match is found, the IPS immediately
blocks all incoming packets from the suspicious sender [13].
However, IPS goes beyond just signature matching. It also incorporates anomaly detection
techniques to identify deviations from normal network behavior. By establishing a baseline of what
constitutes typical activity within the network, the IPS can flag any abnormal patterns that may
indicate a potential attack. This proactive approach is particularly useful in detecting previously
unknown or zero-day attacks, where no specific signature exists. When suspicious activity is
detected, the IPS responds swiftly to mitigate the threat. Depending on its configuration, the IPS
can generate alerts to notify security personnel, log the event for further analysis, and most
importantly, take automated actions to block or prevent the malicious traffic from reaching its
intended target. These actions could include dropping packets, blocking connections, or even
reconfiguring firewalls to close off access.
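The configurable responses described above (alerting, logging, and blocking) can be sketched as a small class. The class name, the packet dictionary layout, and the toy detector are hypothetical and serve only to show the control flow; they do not correspond to any particular product.

```python
class SimpleIPS:
    """Toy inline IPS: alert, log, and block on detection (illustrative only)."""

    def __init__(self, detector, actions=("alert", "log", "block")):
        self.detector = detector     # callable: packet -> bool
        self.actions = set(actions)  # configured responses
        self.blocked = set()         # senders whose traffic is dropped
        self.alerts = []
        self.event_log = []

    def handle(self, packet):
        """Return 'forwarded' or 'dropped' for a single packet."""
        src = packet["src"]
        if src in self.blocked:
            return "dropped"
        if not self.detector(packet):
            return "forwarded"
        if "alert" in self.actions:
            self.alerts.append(f"suspicious traffic from {src}")
        if "log" in self.actions:
            self.event_log.append(packet)
        if "block" in self.actions:
            self.blocked.add(src)
            return "dropped"
        return "forwarded"


def toy_detector(packet):
    """Hypothetical detector: a single hard-coded signature."""
    return "' OR 1=1" in packet["payload"]


ips = SimpleIPS(toy_detector)
results = [
    ips.handle({"src": "10.0.0.5", "payload": "GET /home"}),
    ips.handle({"src": "10.0.0.9", "payload": "id=1' OR 1=1"}),
    ips.handle({"src": "10.0.0.9", "payload": "GET /home"}),
]
print(results)
```

Constructing the object with actions=("alert", "log") leaves out the blocking step, so the same detector then behaves like an IDS that only notifies administrators.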
Overall, an IPS plays a critical role in enhancing network security by providing real-time threat
detection and prevention capabilities. By combining signature-based detection with anomaly
detection and automated response mechanisms, IPS helps organizations defend against a wide
range of cyber threats, ensuring the integrity and availability of their networks and systems.
2.3 Types of intrusion detection
Intrusion detection is crucial for identifying and responding to security threats within networks
and systems. Two main methods, signature-based and anomaly-based, are employed by Intrusion
Detection Systems to achieve this. Signature-based detection depends on recognizing established
attack patterns, whereas anomaly-based detection actively identifies deviations from typical
behavior.
One of the key strengths of anomaly-based detection is its ability to detect novel and sophisticated
attacks that may evade signature-based detection. By continuously monitoring network traffic,
system logs, and user behavior, anomaly-based IDS can identify unusual patterns or behaviors
indicative of potential security threats, such as insider attacks, zero-day exploits, or targeted
intrusions [11]. However, anomaly-based detection may also produce false positives if legitimate
activity deviates from the established baseline. Additionally, it requires careful tuning and
configuration to differentiate between normal variations in network or system behavior and
genuine security threats [18].
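The tuning problem mentioned above can be made concrete with a small sketch that counts errors at two detection thresholds. The anomaly scores and labels are invented for illustration (label 1 marks a genuine attack, label 0 marks legitimate activity).

```python
def count_errors(scores, labels, threshold):
    """Return (false_positives, false_negatives) for a given threshold."""
    pairs = list(zip(scores, labels))
    false_pos = sum(1 for s, y in pairs if s > threshold and y == 0)
    false_neg = sum(1 for s, y in pairs if s <= threshold and y == 1)
    return false_pos, false_neg


# Hypothetical anomaly scores; some legitimate activity scores fairly high.
scores = [0.5, 1.2, 2.4, 3.5, 2.8, 6.0]
labels = [0, 0, 0, 0, 1, 1]

sensitive = count_errors(scores, labels, threshold=2.0)
lenient = count_errors(scores, labels, threshold=4.0)
print(sensitive, lenient)
```

With the lower threshold every attack is caught but two legitimate events raise alarms; with the higher threshold the false alarms disappear but one attack is missed. Choosing the threshold is therefore a trade-off rather than a single correct setting.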
Despite these challenges, anomaly-based intrusion detection plays a crucial role in augmenting
overall security posture. By complementing signature-based detection, anomaly-based IDS
provides an additional layer of defense against emerging threats and helps organizations
proactively respond to potential security incidents before they escalate. Combined with other
detection methods, anomaly-based detection enhances the ability of security teams to detect and
mitigate a wide range of cyber threats effectively.
3. Research Methodology
This chapter gives an insight into the methodological approach and research design chosen for this
study. Within this context, the methodology outlined in the following points was deemed most
suitable for understanding how AI, using various algorithms and methods, can detect intrusions
within the concept of IPS and IDS. Additionally, the chapter offers explanations for the selected
methods and elaborates on the analysis processes and research instruments used in greater detail.
3.1 Choice of research method
Academic research across all disciplines is inherently based on and interconnected with existing
knowledge. Therefore, the precise advancement of this knowledge should be a fundamental
priority for all scholars. Nevertheless, this task has grown notably intricate. In the domain of
information security, knowledge is expanding rapidly, yet it remains dispersed across various
fields and disciplines, which makes research more challenging. This presents challenges
in staying abreast of state-of-the-art research, evaluating the combined evidence in specific research
domains, and maintaining a leading position in the field. Hence, the choice of literature review as
a research method is increasingly pertinent.
For literature review research to be considered a proper research methodology,
several stages must be followed to guarantee accuracy, precision, and reliability. Like any other
research, the quality of an academic review is determined by the methodology employed, the
findings obtained, and the clarity of reporting. The choice of methodology for conducting literature
review depends on the purpose at hand. There are various established frameworks for literature
reviews, each with its own merits and suitability for specific objectives. These methods may take
on qualitative, quantitative, or mixed forms depending on the stage of the review process. Table 1
depicts the different literature review approaches [19].
3.1.1 Systematic literature review
Systematic reviews are predominantly employed in medical science to consolidate research
findings; however, their utilization is not as common in business research. A systematic review
can be defined as a research approach and process for identifying and critically evaluating pertinent
research, as well as for gathering and analyzing data from that research. The objective of a
systematic review is to identify all empirical evidence meeting predefined inclusion criteria to
address a specific research question or hypothesis [20].
3.2 Literature review processes
This section outlines the key steps involved in conducting a literature review, from defining the
research scope to synthesizing and analyzing the findings. By following these processes, I could
systematically gather, evaluate, and integrate relevant literature to support the research
objectives of this thesis.
3.7 Review and combination of literature
The findings from the literature review on how AI has impacted intrusion detection
mechanisms are presented, followed by a review of the work, including an explanation of the methodology used
and the resulting debates. Figure 3 depicts the literature review process that was used for
this thesis.
4. Datasets
Researchers use datasets to train machine learning algorithms. For supervised learning, the labeled
data in these datasets helps algorithms learn to distinguish between normal and malicious
activities. The training process involves feeding the algorithm examples from the dataset along
with their labels, allowing the algorithm to learn the characteristics of different types of network
traffic and attacks [23].
After training, algorithms are tested and validated using portions of the dataset that were not used
during training. This step is crucial to ensure that the algorithm generalizes well to new, unseen
data. By dividing the dataset into training and testing sets, researchers can evaluate the algorithm's
performance in a controlled and systematic manner. These datasets provide a standardized way to
measure and compare the performance of different algorithms. Researchers evaluate their
algorithms using various metrics such as accuracy, precision, recall, and F1 score. By using a
common dataset, comparisons between different algorithms become meaningful and consistent
[23].
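The hold-out procedure described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function name `train_test_split`, the 80/20 ratio, and the toy records are assumptions made for the example and are not taken from the cited work.

```python
import random

def train_test_split(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records and split them into train and test lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    # The first n_test shuffled records are held out; the rest are used for training.
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical toy records: (duration_seconds, bytes_sent, label)
records = [(i, 100 * i, "attack" if i % 5 == 0 else "normal") for i in range(50)]
train, test = train_test_split(records)
```

Because the test records never reach the training step, any metric computed on `test` estimates how the algorithm behaves on unseen traffic.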
For unsupervised learning, these datasets help in developing and testing algorithms that can
identify patterns or anomalies without labeled data. Techniques such as clustering and anomaly
detection are applied to these datasets to discover insights about network traffic and identify
potential threats that do not conform to normal patterns. The datasets are also used in academic
settings to teach students about machine learning, network security, and data analysis. They
provide a practical resource for hands-on learning and experimentation, helping students
understand the challenges and techniques involved in intrusion detection and prevention [23].
Overall, datasets like KDD Cup 99, CICIDS2017, and NSL-KDD play a vital role in the
development, testing, and benchmarking of IPS and IDS algorithms. They provide a common
ground for evaluating performance, facilitating reproducibility, and enabling the simulation of real-
world scenarios, thereby advancing the field of network security research.
4.1 KDDCup99
The KDD Cup 99 dataset was created for the Third International Knowledge Discovery and Data
Mining Tools Competition, held in conjunction with the KDD-99 conference. It originates from
the 1998 DARPA Intrusion Detection Evaluation Program by MIT Lincoln Labs and has become
one of the most widely used benchmarks for evaluating the performance of intrusion detection
systems. The dataset is composed of various connection records that simulate network traffic,
making it an invaluable resource for testing and developing both supervised and unsupervised
intrusion detection algorithms [24].
The dataset includes a wide variety of attack types, categorized into four main classes: Denial of
Service (DoS), Remote to Local (R2L), User to Root (U2R), and Probe. DoS attacks aim to
overwhelm the network to disrupt services, such as SYN flood attacks. R2L attacks involve
unauthorized access from a remote machine, like password guessing. U2R attacks escalate
privileges from user to root, exemplified by buffer overflow exploits. Probe attacks involve
surveillance and other probing activities, such as port scanning. These categories ensure that the
dataset provides comprehensive coverage of different intrusion scenarios [24].
Each connection record in the KDD99 dataset is described by 41 features that capture various
characteristics of the network traffic. These features fall into three groups: basic attributes such
as duration, protocol type, service, and flag; content features derived from the payload of TCP
packets, such as the number of failed login attempts; and traffic features calculated over a
two-second window, such as the number of connections to a particular server. The dataset's rich
feature set allows researchers to analyze different aspects of network behavior and develop robust
detection methods [24] [25].
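To make the two-second traffic features more concrete, the following sketch counts, for each connection, how many earlier connections reached the same destination within the preceding two seconds. It is a simplified illustration, not the procedure used to build the dataset: the function name `same_host_counts`, the `(timestamp, destination)` record layout, and the addresses are assumptions, and the records are assumed to be sorted by time.

```python
def same_host_counts(connections, window=2.0):
    """Count earlier connections to the same destination within `window` seconds.

    `connections` is a list of (timestamp_seconds, destination) tuples sorted by time.
    """
    counts = []
    for i, (t, dst) in enumerate(connections):
        n = sum(
            1
            for (t_prev, dst_prev) in connections[:i]
            if dst_prev == dst and t - t_prev <= window
        )
        counts.append(n)
    return counts

# Hypothetical toy traffic: (timestamp, destination address)
connections = [
    (0.0, "10.0.0.5"),
    (0.5, "10.0.0.5"),
    (1.0, "10.0.0.9"),
    (1.8, "10.0.0.5"),
    (4.5, "10.0.0.5"),
]
counts = same_host_counts(connections)  # [0, 1, 0, 2, 0]
```

A sudden rise in such a count is the kind of signal that helps separate flooding or scanning behavior from ordinary traffic.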
The KDD99 dataset's strengths lie in its comprehensive coverage of attack types and its established
role as a benchmark in intrusion detection research. Its wide usage in the research community
facilitates comparative studies and benchmarking, enabling researchers to measure the
performance of new algorithms against well-known results. Additionally, the diverse set of 41
features provides multiple perspectives on network traffic, aiding in the development of more
effective intrusion detection techniques [25].
However, the dataset also has notable weaknesses. It includes redundant records, which can lead
to biased results, and its age means it may not fully represent modern network traffic and attack
patterns. Despite these limitations, the KDD99 dataset remains important for IPS and IDS research
due to its historical significance and the wealth of comparative data it provides. By using this
dataset, researchers can build on a substantial foundation of prior work, advancing the field of
network security [25].
4.2 CICIDS2017
The CICIDS2017 dataset, short for Canadian Institute for Cybersecurity Intrusion Detection
Systems 2017 dataset, is a comprehensive collection of network traffic data aimed at evaluating
intrusion detection systems. It was created by the Canadian Institute for Cybersecurity (CIC) to
address the need for realistic and up-to-date datasets in the field of cybersecurity. Released in 2017,
this dataset provides a more contemporary alternative to older datasets like KDD99, reflecting the
evolving nature of cyber threats [26].
Comprising a diverse range of network traffic, the CICIDS2017 dataset encompasses various
attack scenarios, including distributed denial-of-service (DDoS), brute-force attacks, and botnet
activities, among others. It contains both benign and malicious traffic, offering a realistic
representation of network behavior in modern environments. With roughly 2.8 million labeled flow
records, the dataset includes approximately 16 different attack scenarios [26].
One notable feature of the CICIDS2017 dataset is its focus on network traffic from IoT devices,
which are increasingly becoming targets for cyber-attacks. By including traffic from IoT devices
such as IP cameras, smart TVs, and routers, the dataset reflects the growing complexity and
interconnectedness of modern networks. This emphasis on IoT traffic provides researchers with
valuable insights into the unique challenges posed by securing these devices against cyber threats.
In addition to its relevance and comprehensiveness, the CICIDS2017 dataset offers a significant
improvement in terms of data quality and labeling compared to previous datasets. It consists of 78
features extracted from packet headers, payload data, and flow statistics. Each network flow in the
dataset is meticulously labeled with its corresponding attack type, enabling precise evaluation of
IDS performance [27].
Overall, the CICIDS2017 dataset serves as a valuable resource for researchers, practitioners, and
educators in the field of cybersecurity. Its contemporary nature, diverse attack scenarios, and focus
on network traffic from IoT devices make it an essential tool for advancing intrusion detection
research and improving network security in the face of evolving cyber threats.
4.3 NSL-KDD
The NSL-KDD dataset, a refined version of the original KDD Cup 1999 dataset, is widely used
for evaluating IDS. It was created to address some of the limitations and shortcomings of the
original KDD99 dataset. NSL-KDD retains the diverse range of network traffic and attack
scenarios found in KDD99 while removing redundant and duplicated instances, resulting in a more
balanced and realistic dataset [23].
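The effect of removing duplicated records can be illustrated with a small sketch. The records below are invented toy data, and `deduplicate` is a hypothetical helper; the actual NSL-KDD construction involves more than exact-duplicate removal, so this only shows why redundancy skews class counts.

```python
from collections import Counter

def deduplicate(records):
    """Keep the first occurrence of each record, preserving order."""
    seen = set()
    unique = []
    for record in records:
        if record not in seen:
            seen.add(record)
            unique.append(record)
    return unique

# Hypothetical toy records: (protocol, service, flag, label)
records = (
    [("tcp", "http", "SF", "normal")] * 2
    + [("icmp", "ecr_i", "SF", "smurf")] * 6
    + [("tcp", "telnet", "SF", "guess_passwd")]
)

before = Counter(record[-1] for record in records)
unique = deduplicate(records)
after = Counter(record[-1] for record in unique)
```

Before deduplication the repeated flooding record dominates the label counts; afterwards each distinct record contributes once, so a learner is less biased toward the most frequently repeated traffic.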
Containing both normal network traffic and various types of attacks, the NSL-KDD dataset offers
researchers a comprehensive view of intrusion scenarios. It includes four main attack categories:
Denial of Service, Remote to Local, User to Root, and Probing, with a total of 22 attack types.
This diverse range of attacks allows for thorough evaluation of IDS performance across different
threat scenarios [28].
With 41 features extracted from network traffic data, including attributes such as duration, protocol
type, service, and more, the NSL-KDD dataset provides rich information for training and testing
intrusion detection models. Furthermore, the dataset addresses the class imbalance issue present
in KDD99 by providing a more balanced distribution of normal and attack instances [28].
The NSL-KDD dataset has become a valuable resource for intrusion detection research, offering a
more realistic and up-to-date alternative to the original KDD99 dataset. Its improved data quality,
balanced distribution of instances, and diverse attack scenarios make it a preferred choice for
evaluating IDS algorithms and techniques. Researchers and practitioners alike can leverage NSL-
KDD to develop more effective intrusion detection systems and improve network security in the
face of evolving cyber threats.
5. Attacks
Understanding various cyber-attack types is crucial for safeguarding systems and networks against
malicious activities. This section explores different categories of cyber-attacks: Denial of Service
(DoS), Probing, Remote to Local (R2L), and User to Root (U2R). Each category targets different
aspects of system security, from disrupting services to gaining unauthorized access. By examining
the characteristics and techniques of each attack type, cybersecurity professionals can better
prepare defenses and mitigate the risks posed by cyber threats [29] [30] [31] [32]. Tables 2-5
provide some examples of the four categories of cyber-attacks.
Attack Type | Description
Probing | Scanning or probing a target system to gather information about its vulnerabilities and weaknesses.
Port Scanning | Scanning a range of ports on a target system to determine which services are running and potentially vulnerable to attack.
Table 4. Example of Probing Attacks
6. Metrics
When researchers evaluate and understand supervised and unsupervised algorithms used in
Intrusion Prevention Systems (IPS) and Intrusion Detection Systems (IDS), they rely on several
key metrics to assess performance, accuracy, and effectiveness [33]. For supervised algorithms,
accuracy is a fundamental metric. It measures the proportion of correctly classified instances out
of the total instances, offering an overall sense of the algorithm’s performance. However, accuracy
alone may not provide a complete picture, especially in cases where data is imbalanced, such as
when there are far more benign instances than malicious ones [34]. This is where precision and
recall become crucial. Precision, defined as the proportion of true positive results out of all positive
predictions, helps in understanding how many of the detected threats were actual threats. High
precision is essential for reducing false alarms, which can overwhelm security teams and
desensitize them to real threats. On the other hand, recall, or sensitivity, measures the proportion
of true positive results out of all actual positives, indicating how well the system can detect actual
threats. A high recall ensures that most, if not all, threats are detected, minimizing the risk of a
breach [35]. To balance precision and recall, researchers use the F1 score, which is the harmonic
mean of precision and recall. The F1 score provides a single metric that captures both the false
positives and false negatives, offering a more comprehensive measure of an algorithm’s
performance. The most common metrics that researchers use, and how they are calculated, are
explained below [36].
True Positive (TP): This is the number of correctly classified positive instances. In the context of
intrusion detection, a true positive occurs when the model correctly identifies an attack.
False Positive (FP): This is the number of incorrectly classified positive instances. In intrusion
detection, a false positive occurs when the model incorrectly labels normal network traffic as an
attack.
True Negative (TN): This is the number of correctly classified negative instances. In intrusion
detection, a true negative occurs when the model correctly identifies normal network traffic as not
being an attack.
False Negative (FN): This is the number of incorrectly classified negative instances. In intrusion
detection, a false negative occurs when the model fails to detect an attack, incorrectly labeling it
as normal network traffic.
These four counts are used to calculate the following evaluation metrics:
Accuracy: This metric evaluates the overall accuracy of the model's predictions. It is calculated as
the ratio of the total number of correct predictions (TP + TN) to the total number of predictions
made [34].
\[ \text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \]
Recall (Sensitivity or True Positive Rate): This metric measures the proportion of actual positive
instances that were correctly identified by the model. It is determined by dividing the number of
true positives by the total of true positives and false negatives [35].
\[ \text{Recall} = \frac{TP}{TP + FN} \]
Precision (Positive Predictive Value): This metric measures the proportion of positive predictions
that were actually correct. It is determined by dividing the number of true positives by the total of
true positives and false positives [34].
\[ \text{Precision} = \frac{TP}{TP + FP} \]
F1 Score: Evaluates the performance of a classification model based on precision and recall, which
are measures of the model’s accuracy and completeness. It provides insight into how well the
model balances both precision and recall [36]. It is calculated as:
\[ \text{F1 Score} = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \]
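The four formulas can be checked with a short worked example. The function name `classification_metrics` and the confusion counts are assumptions chosen for illustration: 1000 connections, of which 50 are attacks, with a hypothetical model that detects only 10 of them and raises 5 false alarms.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, and F1 score from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical imbalanced case: 950 benign and 50 malicious connections.
metrics = classification_metrics(tp=10, fp=5, tn=945, fn=40)
```

Here accuracy is 0.955 even though recall is only 0.2 and the F1 score is roughly 0.31, which illustrates why accuracy alone is misleading when benign traffic far outnumbers attacks.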
7. Machine learning
Machine learning plays a transformative role in the development of Intrusion Prevention Systems
(IPS) and Intrusion Detection Systems (IDS), offering significant advancements over traditional
anomaly-based and signature-based detection methods. Its ability to learn from data and adapt to
new threats makes it a powerful tool in the ever-evolving landscape of cybersecurity [5] [6].
Traditional signature-based detection relies on predefined patterns and signatures of known attacks
to identify malicious activities. While effective against known threats, this method falls short when
faced with new, unknown attacks or variations of existing ones. Signature-based systems require
constant updates to their signature databases, leading to delays in detecting new threats.
Additionally, they struggle to identify sophisticated attacks that do not match existing signatures
[17].
Anomaly-based detection, on the other hand, identifies deviations from established normal
behavior to detect potential intrusions. This method is better at detecting unknown threats
compared to signature-based detection, as it does not rely on specific signatures. However,
anomaly-based systems often suffer from high false positive rates, flagging benign activities as
malicious due to variations in normal behavior. This can overwhelm security teams and reduce the
overall effectiveness of the detection system [18].
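A minimal sketch of anomaly-based detection, assuming a single numeric feature (requests per minute) and a simple rule that flags any value more than three standard deviations from the baseline mean. The helper names, the threshold `k`, and the sample values are illustrative assumptions; real systems model many features at once.

```python
import statistics

def fit_baseline(values):
    """Summarize normal behavior by its mean and population standard deviation."""
    return statistics.mean(values), statistics.pstdev(values)

def is_anomalous(value, baseline, k=3.0):
    """Flag a value that deviates from the baseline mean by more than k standard deviations."""
    mean, std = baseline
    return abs(value - mean) > k * std

# Hypothetical requests-per-minute observed during normal operation.
normal_rates = [98, 102, 100, 101, 99, 100]
baseline = fit_baseline(normal_rates)

flood_flagged = is_anomalous(500, baseline)   # True: a clear deviation
usual_flagged = is_anomalous(101, baseline)   # False: within normal variation
burst_flagged = is_anomalous(105, baseline)   # True: a harmless burst is flagged too
```

The last case shows the false positive problem: a baseline learned from a narrow sample treats a modest, benign increase as an intrusion.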
Machine learning enhances IPS and IDS by addressing the limitations of both anomaly-based and
signature-based detection. Supervised machine learning algorithms can be trained on labeled
datasets to recognize patterns associated with both normal and malicious activities. These
algorithms can identify complex patterns and correlations that are not apparent to traditional
methods. Once trained, they can generalize from the training data to detect new, unseen threats
with higher accuracy and lower false positive rates [5] [6].
One of the key advantages of machine learning in IPS and IDS is its ability to continuously learn
and adapt. Machine learning models can be updated with new data, allowing them to evolve with
emerging threats. This adaptive capability ensures that the detection systems remain effective over
time, even as attack strategies become more advanced. Additionally, machine learning can
automate the detection process, reducing the reliance on human intervention and allowing security
professionals to focus on more strategic tasks [37]. Machine learning also facilitates the integration
of multiple data sources, providing a more comprehensive view of network activity. By analyzing
data from various sources, such as network traffic, user behavior, and system logs, machine
learning models can achieve a deeper understanding of potential threats. This holistic approach
enhances the accuracy and reliability of intrusion detection systems [38].
7.1 Machine Learning Classification of Problems
The aim of ML is to execute complex tasks autonomously, without human intervention, thereby
advancing in sophistication through continual learning from experiences. This ability enables ML
to tackle intricate problems, including those characterized by high complexity. Before taking
action to solve a problem, ML identifies the problem and places it in the right problem category.
ML places each problem in one of five categories [39]:
1. Classification Problem: categorizes data into specific groups that are known beforehand,
such as YES or NO.
2. Anomaly Detection Problem: examines patterns and identifies variations or irregularities
within them, that is, outliers or abnormalities in the data.
3. Regression Problem: addresses questions of quantity, where the output is continuous and
numerical.
4. Clustering Problem: belongs to the realm of unsupervised learning techniques. Clustering
algorithms aim to discern patterns within data and create clusters based on similarities in
the data's structure. Each cluster represents a distinct group, and the algorithm assigns new,
unseen data points to appropriate clusters based on their characteristics.
5. Reinforcement Problem: arises when decisions rely on past learning experiences.
Through trial-and-error interactions with a dynamic environment, the machine agent learns
behaviors. This approach allows agents to be programmed through rewards and
penalties, without explicitly defining how the task should be executed.
Figure 4 depicts the working process of Machine Learning, which consists of the following steps:
1. Collection and Preparation of Data: The first step involves gathering and organizing the
data. Once collected, the data need to be formatted to make them understandable for the
algorithms. This preparation step involves cleaning the data and removing any extraneous or
irrelevant information. Web data often arrives in disorganized formats with unnecessary
details.
2. Feature Selection: The data collected in the previous step may include many features,
some of which might not be essential for the learning process. It's necessary to
filter out these irrelevant features and select only the most significant ones for further
analysis.
3. Choice of Algorithm: Different machine learning algorithms are tailored to different types
of problems. Each algorithm has its strengths and weaknesses, making some more suitable
for certain types of tasks than others. Therefore, selecting the most appropriate machine
learning algorithm for a particular problem is essential to achieve the most effective results.
4. Selection of Models and Parameters: The majority of machine learning algorithms typically
need some initial manual adjustment to determine the best values for different parameters.
5. Training: Once the suitable algorithm and parameter values have been chosen, the model
must undergo training using a portion of the dataset designated as training data.
6. Performance Evaluation: Before deploying the system in real-time, it's crucial to test the
model with unseen data to assess its learning progress. This evaluation involves measuring
performance using metrics such as accuracy, precision, and recall.
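The six steps can be traced in a deliberately small sketch. Everything here is an assumption made for illustration: the synthetic records, the choice of a one-feature threshold rule as the "algorithm", and the candidate threshold values. It is not a realistic detector, only a walk through the workflow.

```python
import random

rng = random.Random(0)

def make_record(is_attack):
    # Step 1 (collection and preparation): hypothetical, already cleaned records
    # with three features: [noise, failed_logins, noise].
    failed_logins = rng.randint(3, 6) if is_attack else rng.randint(0, 1)
    features = [rng.random(), float(failed_logins), rng.random()]
    return features, int(is_attack)

data = [make_record(i % 2 == 0) for i in range(200)]

# Step 2 (feature selection): keep only feature index 1, the failed login count.
SELECTED = 1

# Step 3 (choice of algorithm): a simple threshold rule on the selected feature.
def predict(features, threshold):
    return int(features[SELECTED] >= threshold)

def accuracy(rows, threshold):
    return sum(predict(x, threshold) == y for x, y in rows) / len(rows)

# Step 4 (models and parameters): candidate values for the single parameter.
candidates = [1.0, 2.0, 3.0, 4.0]

# Step 5 (training): pick the candidate that performs best on the training portion.
train, test = data[:150], data[150:]
threshold = max(candidates, key=lambda t: accuracy(train, t))

# Step 6 (performance evaluation): measure accuracy on records not used for training.
test_accuracy = accuracy(test, threshold)
```

Real pipelines replace each step with something more capable, but the order of operations and the separation of training and evaluation data stay the same.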
8. Supervised Learning
Aspect | Classification | Regression
Nature of Problem | Categorization of instances into predefined classes (e.g., normal vs. malicious) based on features extracted from network traffic or system logs. | Estimation of a continuous value (e.g., likelihood score, anomaly score) representing the degree of abnormality or risk associated with observed behavior.
Output | Discrete class labels indicating the predicted intrusion category or class (e.g., DoS attack, SQL injection). | Continuous numerical values representing the severity or likelihood of an intrusion event.
Training Approach | Supervised learning using labeled datasets where instances are associated with predefined intrusion classes. | Supervised learning using labeled datasets with annotated intrusion severity or risk scores.
Model Representation | Decision boundaries separating different intrusion classes in the feature space, typically represented by decision trees, support vector machines, or neural networks. | Regression functions mapping input features to a continuous output space, often represented by linear regression, decision trees, or ensemble methods.
Evaluation Metrics | Metrics such as accuracy, precision, recall, F1-score, and confusion matrices are commonly used to evaluate classification performance. | Evaluation metrics include mean absolute error (MAE), root mean squared error (RMSE), and correlation coefficients, assessing the accuracy of predicted intrusion severity or risk scores.
Application in IDS/IPS | Classification is primarily used in IDS to identify and categorize different types of intrusions based on observed network or system behavior. | Regression is commonly employed in IPS to estimate the severity or risk associated with detected intrusions, enabling prioritization of response actions.
Real-time Adaptation | Classifiers can be updated based on new labeled data to adapt to evolving attack patterns or changes in network behavior. | Regression models can be dynamically adjusted based on feedback from the network environment to reflect changes in intrusion severity or risk levels.
Handling Imbalanced Data | Techniques such as resampling, cost-sensitive learning, or ensemble methods can address imbalanced class distributions commonly encountered in intrusion detection datasets. | Imbalanced regression data may require specialized techniques such as robust loss functions or outlier detection to mitigate the impact of skewed target distributions.
Scalability Considerations | Classifiers may require efficient feature selection and dimensionality reduction techniques to scale to large-scale network environments with high traffic volumes. | Regression models need to handle large datasets efficiently while maintaining computational scalability for real-time intrusion severity estimation in high-throughput network environments.
Table 6. Different aspects of classification and regression
Data preprocessing follows: the raw data collected from the network undergoes tasks such as
normalization, feature extraction, and transformation. This step
prepares the data for analysis by converting it into a format suitable for use with machine learning
algorithms like decision trees. The preprocessed data is used to train a decision tree model. During
training, the decision tree algorithm learns to recognize patterns and relationships within the data
that are indicative of different types of attacks. The NSL-KDD dataset, with its labeled instances
of normal and attack traffic, serves as a valuable source for training the decision tree model [46].
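As an illustration of the normalization task mentioned above, the sketch below applies min-max scaling to one numeric column so that its values fall between 0 and 1. Min-max scaling is only one common choice and is an assumption here, as the source does not name a specific method; the function name and sample durations are likewise invented.

```python
def min_max_normalize(column):
    """Rescale a list of numbers linearly so the smallest becomes 0 and the largest 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map every value to 0.
        return [0.0 for _ in column]
    return [(value - lo) / (hi - lo) for value in column]

# Hypothetical connection durations in seconds.
durations = [0, 2, 10, 4]
normalized = min_max_normalize(durations)  # [0.0, 0.2, 1.0, 0.4]
```

Scaling of this kind keeps features with large numeric ranges, such as byte counts, from dominating features with small ranges.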
Feature selection techniques are also applied to the dataset to identify the most relevant features
for detecting intrusions. This helps improve the efficiency and accuracy of the decision tree model
by focusing on the most informative attributes. Once trained, the decision tree model is deployed
within the IDS or IPS. As new network traffic data flows through the system, the decision tree
analyzes the features extracted from each data instance and classifies it as either normal or
malicious based on the learned patterns. If an instance is classified as malicious, it triggers an alert
or takes preventive actions in the case of an IPS [48].
The performance of the decision tree-based IDS/IPS is continuously evaluated using metrics such
as accuracy, precision, recall, and false positive rate. Feedback from the evaluation process may
be used to refine the decision tree model further and improve its ability to detect and classify
attacks accurately. By leveraging decision trees trained on datasets like NSL-KDD, IDS and IPS
can effectively identify a wide range of attacks, including Denial of Service (DoS), Probe,
Unauthorized Access (R2L and U2R), and other types of malicious activities, thereby enhancing
the security posture of the network infrastructure [44]. Figure 5 illustrates the overall process
of implementing decision trees within Intrusion Detection Systems (IDS) and Intrusion Prevention
Systems (IPS). It provides a detailed visual representation of how decision tree algorithms are
applied to analyze network traffic and identify potential security threats.
8.2 Random Forest
Random Forest is a type of machine learning algorithm adept at handling both classification and
regression tasks. It operates on the principle of ensemble learning, where numerous decision trees
are trained and combined in order to make predictions. Decision trees are like flowcharts that
navigate through data features by posing a sequence of inquiries. At each junction, the data
diverges into smaller subsets based on the answers obtained [49].
Rather than depending on a single decision tree, Random Forest builds an ensemble of trees. Each
tree is trained on a randomized subset of data and features, introducing variability that combats
overfitting and bolsters the model's robustness. Each classification tree is grown from a bootstrap
sample drawn from the original dataset. Once the forest is constructed, a new object to be classified
is passed to every tree within the forest [50]. When it is time to make predictions, each tree in the
forest casts a vote. In classification tasks, like intrusion detection, the final prediction is the
class that most of the decision trees agree upon.
Moreover, Random Forest reveals the paramount features pivotal for accurate predictions. It
measures how much each feature decreases the impurity or increases the information gain when
it's used in a decision. This feature importance analysis enlightens us on the data facets most
instrumental in detecting intrusions, presenting valuable insights for effective threat identification
[49].
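The impurity decrease behind feature importance can be computed by hand for a single split. The sketch below uses Gini impurity, which is one common impurity measure and an assumption here; the toy rows, labels, and thresholds are invented. A Random Forest would aggregate such decreases over every split in every tree, which this example does not do.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def impurity_decrease(rows, labels, feature, threshold):
    """Gini impurity before a split minus the size-weighted impurity after it."""
    left = [y for x, y in zip(rows, labels) if x[feature] <= threshold]
    right = [y for x, y in zip(rows, labels) if x[feature] > threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted

# Hypothetical toy data: feature 0 = failed logins, feature 1 = duration in seconds.
rows = [[0, 5], [1, 9], [0, 2], [4, 7], [5, 3], [6, 8]]
labels = ["normal", "normal", "normal", "attack", "attack", "attack"]

gain_failed_logins = impurity_decrease(rows, labels, feature=0, threshold=2)
gain_duration = impurity_decrease(rows, labels, feature=1, threshold=5)
```

Splitting on failed logins separates the classes completely (a decrease of 0.5), while splitting on duration barely helps, so failed logins would rank as the more important feature in this toy case.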
The benefits of the Random Forest algorithm include a reduction in model variance as the number
of decision trees increases, which makes its predictions more stable. Furthermore, the Random
Forest classifier is resistant to overfitting and reduces the need for explicit feature selection,
since the trees collectively draw on a wide range of attributes. Nevertheless, drawbacks of this
classifier include dependency on random generation, low model interpretability, and performance
degradation attributed to correlated variables [51]. Figure 6 depicts the process of utilizing the
Random Forest algorithm.
Figure 6. Process of Random Forest
The first crucial step in deploying Random Forest for intrusion detection is the preparation of input
data. This involves gathering and preprocessing a dataset containing pertinent information about
network traffic, including packet headers, protocols, source and destination addresses, and other
relevant attributes. Subsequently, the dataset is divided into distinct subsets for training and testing
purposes, ensuring the integrity of the evaluation process [52].
Following data preparation, the Random Forest algorithm is initialized with appropriate
parameters. Key considerations include the determination of the number of decision trees
(n_estimators) comprising the forest, as well as setting hyperparameters such as the maximum
depth of each tree and the number of features considered for each split. These parameters play a
crucial role in shaping the performance and behavior of the Random Forest model [52].
With the Random Forest initialized, the training phase commences. For each decision tree within
the forest, a process of iterative training ensues. This involves the generation of bootstrap samples
from the training dataset, wherein data points are randomly sampled with replacement.
Additionally, a subset of features is randomly selected for consideration during the training
process. Through this iterative procedure, each decision tree is trained on a unique subset of data
and features, imparting variability to the ensemble [53].
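The two sources of randomness described above can be sketched as follows. The helper names, the seed, and the sizes are assumptions for illustration. Note also that many Random Forest implementations draw a fresh feature subset at every split rather than once per tree; the per-tree draw below is a simplification chosen for brevity.

```python
import random

def bootstrap_sample(n_rows, rng):
    """Draw n_rows row indices uniformly at random, with replacement."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]

def random_feature_subset(n_features, k, rng):
    """Pick k distinct feature indices at random."""
    return sorted(rng.sample(range(n_features), k))

rng = random.Random(42)
n_rows, n_features, n_trees = 10, 9, 3

# One (row indices, feature indices) plan per tree in the forest.
plans = [
    (bootstrap_sample(n_rows, rng), random_feature_subset(n_features, 3, rng))
    for _ in range(n_trees)
]
```

Because sampling is done with replacement, some rows typically appear more than once in a tree's sample while others are left out, which is what gives each tree a different view of the data.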
As training concludes and the Random Forest is ready for deployment, the algorithm employs a
sophisticated voting mechanism to aggregate predictions from individual decision trees. When
faced with the task of classifying new instances, each decision tree within the forest independently
provides a prediction. Subsequently, the final classification is determined through a process of
majority voting, wherein the class predicted by the majority of decision trees is selected as the
final output. This collective decision-making process enhances the robustness and reliability of the
intrusion detection system [51] [52].
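Majority voting itself is a small operation. In this sketch the per-tree predictions are hard-coded, hypothetical outputs; in a real forest each would come from a trained decision tree.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the class label predicted by the largest number of trees."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical predictions from five trees for one network connection.
tree_predictions = ["attack", "normal", "attack", "attack", "normal"]
final_prediction = majority_vote(tree_predictions)  # "attack"
```

With an even number of trees a tie is possible; `Counter.most_common` then returns the label that was counted first, so using an odd number of trees is a simple way to avoid ties in two-class problems.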
With the voting mechanism employed, the Random Forest model is ready to classify new instances
of network traffic. The trained model is applied to the testing dataset, and predictions are generated
for each instance. Subsequently, the performance of the model is evaluated using established
metrics such as accuracy, precision, recall, and F1-score. These metrics provide valuable insights
into the efficacy of the intrusion detection system and facilitate informed decision-making
regarding system refinement and optimization [53].
In addition to prediction and evaluation, Random Forest facilitates feature importance analysis,
shedding light on the attributes that significantly contribute to the detection of intrusions. By
quantifying the impact of each feature on the classification process, the algorithm elucidates the
underlying patterns and characteristics of malicious network activity. This analysis enables
cybersecurity practitioners to prioritize and focus on the most influential features, thereby
enhancing the efficacy and efficiency of intrusion detection systems [50].
Finally, the Random Forest algorithm produces actionable outputs in the form of final predictions
and performance metrics. These outputs provide valuable insights into the performance and
effectiveness of the intrusion detection system, empowering stakeholders to make informed
decisions regarding network security measures. Furthermore, the insights gleaned from feature
importance analysis inform ongoing efforts to refine and optimize the intrusion detection
framework, ensuring continued resilience against evolving cybersecurity threats.
A Support Vector Machine (SVM) is a supervised machine learning method grounded in statistical
learning theory. Its primary function is to sort data into distinct groups or categories. During the
training phase, an SVM examines a collection of labeled examples and finds a hyperplane that
separates the classes. Among all hyperplanes that separate the data, the SVM chooses the one with
the largest margin, that is, the largest distance between the hyperplane and the nearest training
points of each class. A wide margin helps the SVM make accurate predictions when presented with
new, unseen data. Linear SVMs are suited to situations where the classes can be separated by a
straight line (or, in higher dimensions, a flat hyperplane). The boundary is determined only by the
subset of training points that lie closest to it, referred to as support vectors. This selective focus
makes classification efficient, because only the most pertinent data points influence the decision
[53].
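As a minimal illustration of the linear case, the following Python sketch evaluates the decision rule of
an already trained linear SVM and the width of its margin. The weight vector w, the bias b, and the
two sample points are hypothetical values chosen by hand for this example; they are not taken from
[53] or from any trained model.

```python
import math

def decision_score(w, b, x):
    """Signed score w.x + b; its sign tells on which side of the hyperplane x lies."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def margin_width(w):
    """Distance between the two supporting hyperplanes w.x + b = +1 and -1: 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

# Hypothetical hyperplane for a toy problem with two features.
w, b = [1.0, 1.0], -3.0

# Hypothetical feature vectors; +1 is read as "malicious", -1 as "benign".
points = {"benign": [0.5, 1.0], "malicious": [3.0, 2.5]}
labels = {name: (1 if decision_score(w, b, x) >= 0 else -1)
          for name, x in points.items()}
```

Only the sign of the score is needed for classification; the magnitude indicates how far the point lies
from the boundary.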
Moreover, Support Vector Machines can classify both linearly and non-linearly separable data.
Figure 3 depicts the classification of linear and non-linear data. When handling non-linear data,
SVM classifiers use kernel functions. These functions implicitly map the data into a
higher-dimensional space in which the classes become easier to separate. Various kernel functions,
including the polynomial kernel, the Radial Basis Function (RBF) kernel, and the sigmoid kernel,
are employed in SVM classification [55] [56]. Figure 7 depicts the difference between linear and
non-linear SVM.
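To make the role of kernels concrete, the sketch below implements three common kernel functions
in plain Python. The parameter names degree, coef0, and gamma and their default values are
assumptions made for illustration; practical values are normally chosen by tuning.

```python
import math

def linear_kernel(x, y):
    """Plain dot product; corresponds to a linear SVM."""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree=3, coef0=1.0):
    """(x.y + coef0) ** degree; degree and coef0 are illustrative defaults."""
    return (linear_kernel(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=0.5):
    """exp(-gamma * ||x - y||^2); equals 1 for identical points and decays with distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)
```

Each function returns a similarity value for a pair of feature vectors, which the SVM uses in place of
the dot product without ever computing the higher-dimensional mapping explicitly.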
The first step in building an SVM-based detector is the collection of comprehensive network
traffic data covering both normal behavior and various types of attacks, such as Denial of Service
(DoS), Distributed Denial of Service (DDoS), port scanning, SQL injection, and malware
propagation. Including both attack and normal traffic helps ensure the robustness and
effectiveness of the Support Vector Machine model [57].
Once the data have been collected, feature extraction transforms raw network traffic data
into a format suitable for machine learning algorithms. Features may include packet headers,
payload content, source and destination IP addresses, source and destination ports, packet size
distributions, and temporal patterns. These features provide rich information about the network
traffic and enable the SVM model to discern patterns associated with various types of attacks [58].
Prior to training the SVM model, the extracted features undergo preprocessing steps to ensure data
quality and consistency. This may involve tasks such as normalization to scale features within a
consistent range, handling missing values through imputation techniques, and encoding categorical
variables using methods like one-hot encoding. Data preprocessing enhances the robustness of the
SVM model by mitigating the effects of noise and inconsistencies in the data [55].
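The preprocessing steps named above can be sketched in a few lines of Python. This is a minimal
illustration under simple assumptions: missing values are represented by None and replaced with the
column mean, and the sample values (packet sizes and protocol names) are invented for the example.

```python
def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed entries."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]

def min_max_scale(values):
    """Rescale a numeric column to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column carries no information
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(values):
    """Encode a categorical column as 0/1 indicator vectors, one slot per category."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values], categories

# Hypothetical columns: packet sizes with one missing value, and protocol names.
sizes = min_max_scale(impute_mean([60, None, 1500]))
protocols, categories = one_hot(["tcp", "udp", "tcp"])
```

In practice the scaling bounds and the category list would be computed on the training data only and
then reused for test and live traffic.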
The labeled dataset, comprising instances of both benign and malicious network traffic, is prepared
for training the SVM model. Each instance in the dataset is annotated with its corresponding class
label, indicating whether it represents normal behavior or malicious activity. The dataset
encompasses a diverse range of attacks, including but not limited to DoS attacks, DDoS attacks,
port scans, brute force attacks, and malware infections [57].
The SVM algorithm is trained using the prepared dataset to learn the underlying patterns that
distinguish between benign and malicious network traffic. During training, the SVM seeks to find
the optimal hyperplane that maximizes the margin between different classes while minimizing
classification errors. By iteratively adjusting the hyperplane based on the training examples, the
SVM effectively learns to differentiate between normal and anomalous network behavior [59].
After training, the performance of the SVM model is evaluated using a separate test dataset that
was not utilized during training. This evaluation assesses various performance metrics such as
accuracy, precision, recall, and F1-score to gauge the model's effectiveness in identifying
malicious traffic while minimizing false positives. Additionally, techniques such as cross-
validation may be employed to ensure the robustness and generalization capability of the SVM
model across different datasets and network environments [58] [59].
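The metrics mentioned above follow directly from the counts of true and false positives and negatives.
The following sketch computes them for a binary detector; the label encoding (1 for malicious, 0 for
benign) and the example predictions are assumptions made for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1-score for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,  # share of alerts that were real attacks
        "recall": recall,        # share of real attacks that raised an alert
        "f1": f1,
    }

# Hypothetical test set: 3 malicious and 5 benign instances.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

Because intrusion datasets are usually imbalanced, precision and recall are generally more
informative than accuracy alone.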
Upon successful evaluation, the trained SVM model is deployed within an IDS or IPS for real-
time detection of malicious traffic. Incoming network packets and flows are subjected to the SVM
model, which analyzes their features and classifies them as either benign or malicious based on
the learned patterns. Detected malicious activities trigger appropriate responses, such as alert
generation, traffic blocking, or mitigation measures to prevent further damage to the network
infrastructure [55].
To adapt to evolving threats and changes in network traffic patterns, the SVM model may require
periodic updates. This involves retraining the model with fresh data collected from the network to
incorporate new attack vectors, emerging threats, and evolving patterns of malicious behavior. By
continuously updating the SVM model, organizations can enhance their network security posture
and effectively mitigate the risks posed by sophisticated cyber-attacks [54]. Figure 8 outlines the
process of implementing a Support Vector Machine (SVM).
Figure 8. Process of Support Vector Machine
8.4 Naïve Bayes
Naive Bayes is a probabilistic classification algorithm based on Bayes' theorem with the
assumption of feature independence. While Naive Bayes is more commonly associated with tasks
like text classification or spam filtering, it can also be adapted for use in IDS and IPS with certain
considerations. Naive Bayes operates on the principles of Bayes' Theorem and conditional
probability. It calculates the probability of a given instance belonging to a particular class based
on the presence of certain features or attributes. In the context of IPS and IDS, these features may
include network traffic characteristics such as packet headers, source and destination addresses,
protocol types, etc. [59].
A key assumption of Naive Bayes is the independence of features, meaning that the presence of
one feature is assumed to be unrelated to the presence of any other feature. While this assumption
may not hold true in practice for all datasets, it simplifies the calculation of probabilities and makes
the algorithm computationally efficient [59].
During the training phase, Naive Bayes analyzes the historical dataset of network traffic incidents
or intrusions. It estimates the prior probability of each class (normal traffic or intrusion) and the
probability of each feature value occurring given each class, using the frequencies of occurrences
in the training data. When a new instance of network traffic is encountered, Naive Bayes calculates
a score for each class by multiplying the class prior by the probabilities of each observed feature
value given that class. The class with the highest score is then assigned to the instance
[60] [61].
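The training and prediction steps can be sketched as a small categorical Naive Bayes classifier. This
is an illustrative sketch, not the implementation used in [60] or [61]: the add-one (Laplace) smoothing,
the use of log-probabilities to avoid numerical underflow, and the toy records (protocol and service
per connection) are all assumptions introduced for the example.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(lambda: defaultdict(Counter))  # label -> index -> value
    vocabulary = defaultdict(set)                               # index -> seen values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[label][i][value] += 1
            vocabulary[i].add(value)
    return class_counts, feature_counts, vocabulary

def predict(model, row):
    """Return the class maximizing log P(class) + sum_i log P(feature_i | class)."""
    class_counts, feature_counts, vocabulary = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # class prior
        for i, value in enumerate(row):
            # Add-one smoothing keeps unseen values from zeroing the product.
            numerator = feature_counts[label][i][value] + 1
            denominator = count + len(vocabulary[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training records: (protocol, service).
rows = [("tcp", "http"), ("tcp", "http"), ("udp", "dns"),
        ("tcp", "ssh"), ("tcp", "ssh"), ("udp", "ssh")]
labels = ["normal", "normal", "normal", "intrusion", "intrusion", "intrusion"]
model = train_naive_bayes(rows, labels)
```

With this toy data, predict(model, ("tcp", "ssh")) selects the intrusion class, because the service value
ssh is far more frequent among the intrusion records than among the normal ones.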
The decision rule used by Naive Bayes is to assign the class label that maximizes the posterior
probability, which is calculated using Bayes' Theorem. In the context of IPS and IDS, this decision
rule helps identify whether the incoming network traffic is benign or malicious. After the model is
trained, its performance is evaluated using metrics such as accuracy, precision, recall, and F1-
score. These metrics measure the algorithm's ability to correctly classify instances of network
traffic and detect intrusions while minimizing false positives and false negatives [62].
One of the advantages of Naive Bayes for IPS and IDS is its ability to adapt to changing network
conditions in real time. Because the model consists of simple frequency-based probability
estimates, it can update those estimates quickly as new instances of network traffic are
encountered and make decisions on the fly, which can help it respond to new traffic patterns.
Bayes' Theorem is a result in probability theory that describes the relationship between conditional
probabilities. It is often used in machine learning and statistics for tasks like classification,
hypothesis testing, and Bayesian inference [63]. Bayes' Theorem is formulated as follows [59]:
$$P(A \mid B) = \frac{P(B \mid A) \times P(A)}{P(B)}$$
P(A∣B): This represents the probability of event A occurring given that event B has already
occurred. In the context of IPS and IDS, this could represent the probability of a network traffic
instance being classified as an intrusion (A) given certain observed features or attributes (B).
P(B∣A): This represents the probability of event B occurring given that event A has already
occurred. In the context of IPS and IDS, this could represent the probability of observing certain
features or attributes (B) given that a network traffic instance is classified as an intrusion (A).
P(A) and P(B): P(A) represents the prior probability of event A occurring, independent of any
other events. In the context of IPS and IDS, this could represent the overall probability of a network
traffic instance being an intrusion. Similarly, P(B) represents the prior probability of event B
occurring, independent of any other events. This could represent the overall probability of
observing certain features or attributes in the network traffic.
Bayes' Theorem allows us to calculate the conditional probability P(A∣B) based on the conditional
probability P(B∣A), the prior probability P(A) and the prior probability P(B). By combining these
probabilities, we can update our beliefs about the likelihood of event A occurring given the
observed evidence B [64].
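A short worked example may help to illustrate the calculation. The numbers are hypothetical and
chosen only for illustration: suppose that 1% of traffic instances are intrusions, P(A) = 0.01, that a
given feature pattern B appears in 90% of intrusions, P(B∣A) = 0.9, and that the same pattern also
appears in 5% of normal traffic, P(B∣¬A) = 0.05. The probability of observing B is obtained from the
law of total probability, and Bayes' Theorem then gives the probability of an intrusion given B:

$$P(B) = P(B \mid A)\,P(A) + P(B \mid \neg A)\,P(\neg A) = 0.9 \times 0.01 + 0.05 \times 0.99 = 0.0585$$

$$P(A \mid B) = \frac{P(B \mid A) \times P(A)}{P(B)} = \frac{0.9 \times 0.01}{0.0585} \approx 0.154$$

Under these assumed values, only about 15% of the instances showing pattern B are actual
intrusions, even though the pattern is present in most intrusions. This shows why the prior
probability P(A) matters when intrusions are rare.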
The authors of [61] propose a two-phase approach for intrusion detection. In the first phase, they
segment data features by type, preprocess them, and apply Naive Bayes for classification. This
includes calculating class probabilities and applying weights based on attack class distribution.
In the second phase, they address misclassified attacks using additional methods such as Linear
Discriminant Analysis (LDA) and Elliptic Envelope classification. Naive Bayes is crucial in
categorizing behavior resembling normal activity. Their approach is reported to outperform other
algorithms in accuracy and recall, especially on the NSL-KDD dataset. Naive Bayes is favored for
its efficiency with large datasets, its compatibility with various data types, and the simplicity
afforded by its feature independence assumption. Its success in text categorization tasks further
supports its suitability for intrusion detection [61].
The researchers in [65] investigated the effectiveness of Naïve Bayes classifiers for intrusion
detection. Their findings showed that Naïve Bayes classifiers performed better when discretization
techniques were applied than when they were not. Using the NSL-KDD and DARPA datasets, the
researchers analyzed the impact of discretization on the classifiers' performance, evaluating
metrics such as accuracy, True Positives (TP), False Positives (FP), and design time. The proposed
model achieved an accuracy of 97.13% on the NSL-KDD dataset and 99.83% on the DARPA dataset,
outperforming alternative approaches. These findings underscore the efficacy of Naïve Bayes
classifiers, particularly when coupled with discretization, for intrusion detection [65].
9. Unsupervised Learning
Unsupervised learning techniques operate without labeled examples during the training phase, making them
particularly well-suited for discerning patterns, groupings, and disparities within unstructured data.
This approach proves valuable in tasks like customer segmentation, exploratory data analysis, and
image recognition. Without the need for external guidance, unsupervised learning algorithms
effectively classify, label, and cluster data points within datasets, autonomously uncovering
patterns. Essentially, unsupervised learning empowers systems to discern underlying structures
within datasets without predefined categories. In this method, AI systems organize unsorted
information based on similarities and differences, devoid of pre-existing labels or categories [66].
Unsupervised learning often leverages generative learning models, although it may also adopt a
retrieval-based strategy commonly associated with supervised learning. The process commences
as machine learning engineers or data scientists feed datasets through algorithms for training.
These datasets lack labels or predefined categories; each data point serves as an unlabeled input
object or sample. The primary objective of unsupervised learning is for algorithms to identify
inherent patterns within training datasets and categorize input objects based on these discovered
patterns. Through analyzing the intrinsic structure of datasets, algorithms extract meaningful
features, enabling the identification of relationships among each sample or input object [66].
These methods typically profile standard behavior, identifying any significant deviations as
potentially malicious. Unlike supervised techniques, unsupervised models do not rely on labeled
datasets, eliminating the need for human experts to designate packets or flows as malicious. The
primary advantage of unsupervised approaches lies in their capability to uncover zero-day attacks
by detecting abnormalities in normal activity, potentially enhancing their resilience to future
threats [67].
During the training phase, unsupervised algorithms only require unlabeled data. However, labels
are essential for evaluating the effectiveness of the models proposed. Acquiring these labels can
prove challenging, as experts must manually inspect and classify data points based on their
expertise. Although an evaluation set is necessary during the research phase, this labor-intensive
process can be minimized [67].
Frequently, these methods are not employed directly on the original network packets. Instead, the
traffic undergoes aggregation, forming bidirectional flows of packets between the same source and
destination. Within these bidirectional flows, various statistical features are computed in both
forward and backward directions. For instance, these features may include packet count, duration,
and average packet length. Converting raw network traffic into flows offers several advantages. It
allows for a more comprehensive overview of the current situation since it considers the entirety
of the flow rather than individual packets. Moreover, this transformation significantly reduces the
volume of data transmitted, which is particularly beneficial for centralized detection systems
because it lowers bandwidth requirements. Unsupervised learning is well suited to clustering,
anomaly detection, association mining, and dimensionality reduction, as shown in Figure 9 [68].
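The aggregation of packets into bidirectional flows can be sketched as follows. This is a simplified
illustration: the packet dictionaries and their field names (ts, src, sport, dst, dport, proto, length)
are hypothetical, the first packet seen is assumed to define the forward direction, and flow timeouts
are ignored.

```python
def flow_key(pkt):
    """Direction-independent key so both directions map to the same flow."""
    endpoints = sorted([(pkt["src"], pkt["sport"]), (pkt["dst"], pkt["dport"])])
    return (pkt["proto"], endpoints[0], endpoints[1])

def aggregate_flows(packets):
    """Group packets into bidirectional flows and accumulate per-direction counters."""
    flows = {}
    for pkt in sorted(packets, key=lambda p: p["ts"]):
        sender = (pkt["src"], pkt["sport"])
        flow = flows.setdefault(flow_key(pkt), {
            "initiator": sender,  # first sender defines the forward direction
            "start": pkt["ts"], "end": pkt["ts"],
            "fwd_packets": 0, "bwd_packets": 0,
            "fwd_bytes": 0, "bwd_bytes": 0,
        })
        direction = "fwd" if sender == flow["initiator"] else "bwd"
        flow[direction + "_packets"] += 1
        flow[direction + "_bytes"] += pkt["length"]
        flow["end"] = max(flow["end"], pkt["ts"])
    return flows

def flow_features(flow):
    """Statistical features of one flow: packet count, duration, mean packet lengths."""
    def mean_length(direction):
        count = flow[direction + "_packets"]
        return flow[direction + "_bytes"] / count if count else 0.0
    return {
        "packet_count": flow["fwd_packets"] + flow["bwd_packets"],
        "duration": flow["end"] - flow["start"],
        "fwd_mean_length": mean_length("fwd"),
        "bwd_mean_length": mean_length("bwd"),
    }

# Hypothetical capture: a client request, a server reply, and a second request.
packets = [
    {"ts": 0.0, "src": "10.0.0.5", "sport": 51000, "dst": "10.0.0.9", "dport": 80,
     "proto": "tcp", "length": 60},
    {"ts": 0.2, "src": "10.0.0.9", "sport": 80, "dst": "10.0.0.5", "dport": 51000,
     "proto": "tcp", "length": 1500},
    {"ts": 0.5, "src": "10.0.0.5", "sport": 51000, "dst": "10.0.0.9", "dport": 80,
     "proto": "tcp", "length": 100},
]
features = [flow_features(f) for f in aggregate_flows(packets).values()]
```

The three packets collapse into a single feature record, which illustrates the reduction in data volume
described above.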
Figure 9. Key techniques for unsupervised learning
9.1 Clustering
Clustering within unsupervised learning entails grouping similar data points together based on
their attributes or traits, without the need for labeled data. The objective of clustering is to
uncover innate structures, patterns, or correlations within the data, providing insight into the
underlying data distribution. Clustering methods can be used to identify anomalous behavior in
network traffic. By clustering normal network traffic patterns, any deviation from these patterns
can be identified as an anomaly, aiding in spotting potential breaches or violations of network
security [69].
Moreover, clustering algorithms can partition network traffic into clusters based on similarities
in attributes such as source IP addresses, destination ports, or packet sizes. This partitioning
can aid in understanding the different types of network activity and potentially in distinguishing
between authorized and unauthorized traffic. Clustering can also aid in formulating rules or
signatures for IDS/IPS systems. By clustering similar network traffic instances, common traits of
attacks can be identified and then used to devise rules for detecting and preventing similar
attacks in the future [70].
As an unsupervised strategy, a clustering technique groups data according to a similarity
measure. The aim of clustering is to achieve high intra-cluster similarity (i.e., data within a cluster
are similar) and low inter-cluster similarity (i.e., data from distinct clusters are dissimilar). Two
widely recognized clustering methods are K-means and DBSCAN. K-means is a partition-based
clustering algorithm that produces sphere-like clusters, while DBSCAN is a density-based
clustering algorithm that yields arbitrarily shaped clusters. Each has its advantages and is
suitable for different situations, depending on factors such as the scale of the dataset or the shape
of the clusters [71].
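A minimal K-means sketch, together with a simple distance-based anomaly check, is given below. It is
an illustration under stated assumptions rather than a production implementation: the initial
centroids are supplied by hand instead of being chosen randomly, the number of iterations is fixed, and
the toy points and the distance threshold of 2.0 are invented for the example.

```python
import math

def nearest(point, centroids):
    """Index of the closest centroid and the Euclidean distance to it."""
    distances = [math.dist(point, c) for c in centroids]
    index = min(range(len(centroids)), key=distances.__getitem__)
    return index, distances[index]

def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to centroids and recomputing centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            index, _ = nearest(p, centroids)
            clusters[index].append(p)
        centroids = [
            tuple(sum(col) / len(cluster) for col in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids

def is_anomaly(point, centroids, threshold):
    """Flag a point that lies farther than threshold from every centroid."""
    return nearest(point, centroids)[1] > threshold

# Hypothetical two-feature traffic profiles forming two groups of normal behavior.
points = [(1, 1), (1, 2), (2, 1), (2, 2), (8, 8), (8, 9), (9, 8), (9, 9)]
centroids = k_means(points, centroids=[(1, 1), (9, 9)])
```

With these values the centroids settle at (1.5, 1.5) and (8.5, 8.5), so a new point such as (5, 0) is far
from both groups and is flagged, while (1, 2) is not.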
The authors of [75] used Principal Component Analysis (PCA) as a feature extraction method for
intrusion detection on IoT networks. They applied PCA to reduce the dimensionality of the dataset
obtained from the packet capture and preprocessing stages. The goal was to extract relevant
features and transform the data into a lower-dimensional structure while preserving as much
variation as possible. They reported that PCA significantly improved the accuracy and precision of
attack detection, reaching up to 100% accuracy in various combinations of training and testing
data allocations. They also identified potential future work, such as involving more attack types
and complex IoT network topologies, to further enhance the effectiveness of PCA in intrusion
detection systems [75].
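The core of PCA can be illustrated with a short sketch that finds the first principal component by
power iteration and projects the data onto it. This is not the procedure used in [75]; it is a simplified
illustration that assumes the data are not constant and that the starting vector is not orthogonal to
the leading component. The toy data are invented for the example.

```python
import math

def first_principal_component(rows, iterations=100):
    """Return the leading principal direction and the 1-D projection of each row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # assumed starting vector
    for _ in range(iterations):
        # Power iteration: repeatedly apply the covariance matrix and renormalize.
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    projections = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, projections

# Hypothetical two-feature data lying exactly on the line y = 2x.
rows = [(1, 2), (2, 4), (3, 6), (4, 8)]
direction, projections = first_principal_component(rows)
```

Because the two features are perfectly correlated here, a single component captures all of the
variation, so the two-dimensional records can be replaced by one value each without loss.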
While Principal Components Analysis offers significant advantages in dimensionality reduction
and feature extraction for IPS and IDS, it also comes with certain disadvantages. One limitation is
that PCA operates under the assumption of linear relationships between variables, which may not
always hold true in complex network environments. This can lead to information loss or distortion
when the underlying data exhibits nonlinear relationships. Additionally, PCA is sensitive to
outliers in the data, which can affect the accuracy of the principal components extracted. Moreover,
PCA does not inherently account for the class separation between normal and malicious network
traffic, potentially leading to misclassification of attacks or anomalies. Furthermore, PCA requires
careful parameter tuning, such as the selection of the number of principal components, which can
be challenging and may vary depending on the dataset and application. Overall, while PCA offers
valuable insights and dimensionality reduction capabilities, it is essential to consider its limitations
and potential drawbacks when applying it to IPS and IDS [76].
9.3 Isolation-Forest
Isolation trees, also known as iTrees, are binary trees used to identify anomalies within a
dataset. Each node within these trees is either an interior node, which has two child nodes and a
split test, or an exterior node, which has no children. A split test comprises an attribute and a
split value, which together partition the data into two subsets. The isolation process recursively
partitions the dataset until specific stopping criteria are met [77].
The isolation forest algorithm combines many isolation trees and identifies outliers as the points
with shorter average path lengths, since anomalies tend to be separated from the rest of the data
after fewer splits. Unlike traditional approaches that build a profile of normal points, isolation
forest constructs partial models from a limited sample size, which makes anomaly detection
efficient. Notably, the algorithm is typically trained on normal samples but tolerates the inclusion
of a few anomalous instances. This characteristic makes the algorithm adaptable to datasets that
have not been meticulously cleaned, provided the contamination parameter is appropriately tuned
[77].
Isolation Forest, a tree-based anomaly detection algorithm, operates on the principle of isolating
anomalies within a dataset, leveraging the notion that outliers are typically sparse and distant from
the bulk of normal instances. By constructing a series of trees and recursively partitioning the data,
each instance is isolated within a leaf node, facilitating the identification of anomalies. Unlike
approaches that model the region occupied by normal points, such as one-class SVM, Isolation
Forest focuses on isolating outliers within a tree structure rather than profiling normal points
[78][79]. The algorithm creates an ensemble of random trees for a given dataset, and anomalies are
identified as the points with the shortest average path lengths across the trees. Isolation Forest is
particularly effective in scenarios marked by significant class imbalance, scattered data points, and
disparate feature spaces, because outliers tend to sit at shallower depths within the tree structure
than normal instances, making them easier to differentiate. Careful selection of parameters such as
n_estimators and contamination is crucial when building an outlier detection model using the
Isolation Forest algorithm [79].
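The isolation principle can be illustrated with a compact pure-Python sketch. It is a simplified
illustration and not the library implementation referred to by the n_estimators and contamination
parameters: the function names, the default values, the fixed random seed, and the toy dataset are
assumptions, and at least two data points are assumed. The score follows the commonly used form
s = 2^(-E[h] / c(n)), where E[h] is the average path length and c(n) a normalizing constant.

```python
import math
import random

def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n

def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random attribute and a random split value."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}  # exterior node
    dim = rng.randrange(len(points[0]))
    lo, hi = min(p[dim] for p in points), max(p[dim] for p in points)
    if lo == hi:
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    return {
        "dim": dim, "split": split,
        "left": build_tree([p for p in points if p[dim] < split], depth + 1, max_depth, rng),
        "right": build_tree([p for p in points if p[dim] >= split], depth + 1, max_depth, rng),
    }

def path_length(tree, point, depth=0):
    """Number of splits needed to reach an exterior node, adjusted for unsplit points."""
    if "size" in tree:
        return depth + c_factor(tree["size"])
    child = tree["left"] if point[tree["dim"]] < tree["split"] else tree["right"]
    return path_length(child, point, depth + 1)

def anomaly_scores(points, n_trees=100, sample_size=64, seed=0):
    """Score in (0, 1]; values near 1 indicate points that are isolated quickly."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [build_tree(rng.sample(points, sample_size), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(sample_size)
    return [2 ** (-sum(path_length(t, p) for t in trees) / (n_trees * norm))
            for p in points]

# Hypothetical data: a 7 x 7 grid of normal points plus one distant outlier.
data = [(x, y) for x in range(7) for y in range(7)] + [(50, 50)]
scores = anomaly_scores(data)
```

The last point lies far from the grid, is usually separated by the first split, and therefore receives
the highest score in the list.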
By employing Isolation Forest in IPS and IDS, security professionals can leverage its ability to
discern abnormal network traffic patterns indicative of potential attacks or intrusions. The
algorithm's capacity to isolate anomalies within the dataset facilitates the early detection of
suspicious activities, enabling timely response and mitigation measures. Furthermore, Isolation
Forest's tree-based nature allows it to adapt to varying feature spaces, making it suitable for
analyzing diverse network environments.
In practical implementation, Isolation Forest offers several advantages for IPS and IDS
applications. Its efficiency and scalability make it well-suited for processing large-scale network
traffic data in real-time, crucial for rapid threat identification and response. Moreover, Isolation
Forest does not require labeled data for training, enabling it to function effectively in unsupervised
intrusion detection scenarios. By carefully selecting parameters such as the number of trees
(n_estimators) and contamination threshold, security practitioners can tailor Isolation Forest to suit
the specific requirements of their IPS and IDS deployments, maximizing its effectiveness in
identifying and mitigating security threats [79].
The authors of [80] conducted experiments to compare the performance of two machine learning
approaches, the Isolation Forest Model and Support Vector Machine, for anomaly-based intrusion
detection. They utilized the WEKA tool and the NSL-KDD dataset to simulate their experiments.
Performance metrics such as Accuracy, Recall, and F-Score were estimated to evaluate the
effectiveness of each approach. The results showed that the Isolation Forest Model achieved a
higher accuracy rate of 0.99 compared to SVM, which had an accuracy rate of 0.95. However,
SVM exhibited slightly higher recall at 0.88 compared to the Isolation Forest Model's recall of
0.87. In their conclusion, the authors emphasized the effectiveness of anomaly-based machine
learning models over conventional methods, highlighting their speed, reliability, and high accuracy
rates, particularly suitable for handling large datasets with minimal memory requirements. They
suggested that the accuracy of machine learning models depends heavily on feature selection and
dimensionality of features and advocated for a hybrid approach in designing intrusion detection
systems to address the increasing rate of various types of anomalies in the cyber world [80].
Isolation forest, despite its effectiveness in anomaly detection, does come with some drawbacks
when applied to intrusion detection scenarios. Firstly, one notable limitation is its difficulty in
effectively detecting specific types of outliers. In certain cases, especially when outliers exhibit
scores close to zero, isolation forest may struggle to differentiate them effectively. This ambiguity
in outlier detection can pose challenges in setting appropriate thresholds for classifying anomalies,
particularly without a clear understanding of the types of outliers present in the dataset [81].
Another disadvantage of isolation forest for intrusion detection is its limited interpretability. The
decision-making process of the algorithm can be complex and may lack transparency, making it
challenging for non-experts to interpret and understand. This lack of interpretability can hinder the
ability to derive actionable insights from the detection results, potentially leading to difficulties in
identifying and addressing security threats effectively.
The one-class Support Vector Machine (SVM), often abbreviated as OCSVM, focuses on training
anomaly detection models using only samples from a single class. This means it operates under
the assumption that all training instances belong to the same category. This approach is particularly
useful in tasks like intrusion detection, where the primary aim is to spot instances that diverge from
the norm. OCSVM offers several advantages, including quicker computation time and lower data
requirements. Moreover, it tends to produce more precise models and exhibits resilience to noisy
samples. In the realm of communication network intrusion detection systems, one-class SVM has
consistently shown its effectiveness as a valuable tool in the field of machine learning [82][83].
A one-class Support Vector Machine can be used to identify and help mitigate potential
security threats within network traffic. The process begins with the collection of network
traffic data from various sources, such as routers, switches, and firewalls, encompassing critical
information like source and destination IP addresses, port numbers, packet sizes, and protocols.
Subsequently, the raw network traffic data undergoes feature extraction, a pivotal step in
transforming the data into a format suitable for machine learning algorithms. This involves
extracting relevant features, such as statistical measures or frequency counts of network events, to
capture significant patterns in the network traffic [84].
Once the features are extracted, datasets are prepared for training and testing the one-class SVM,
typically comprising labeled examples of normal (benign) network traffic. During dataset
preparation, periods or segments of anomaly-free network traffic are labeled as normal, while any
known attack patterns or anomalies are appropriately labeled. The one-class SVM is then trained
exclusively on the normal network traffic data, learning to delineate the boundaries of normal
behavior in the feature space. The objective is to create a robust model that accurately characterizes
normal network traffic while effectively excluding anomalies [85].
Following training, the one-class SVM is deployed within the IPS or IDS system for real-time
monitoring of incoming network traffic. As new network traffic instances are observed, they
undergo feature extraction to transform the raw data into feature vectors. These feature vectors are
subsequently evaluated by the trained one-class SVM, which assigns a score to each instance
indicating its deviation from the learned normal behavior. A predefined threshold is applied to the
SVM scores to differentiate between normal and anomalous instances. Instances surpassing the
threshold are flagged as potential anomalies [84].
In response to detected anomalies, the IPS system may initiate preventive actions, such as blocking
suspicious traffic, whereas the IDS system typically generates alerts or notifications for further
investigation by security personnel. Throughout the process, the performance of the one-class
SVM model is continuously assessed and refined over time, incorporating feedback from detected
anomalies, false positives, and false negatives to enhance its accuracy and effectiveness in
detecting security threats within network traffic. Through iterative improvement and adaptation,
the one-class SVM serves as a vital component in safeguarding network infrastructure against
potential security breaches and vulnerabilities [86].
The primary challenge facing OCSVM lies in handling large and high-dimensional datasets,
primarily due to inefficiencies in feature representation and the complexity of the optimization.
Consequently, OCSVM may not be well suited to applications involving big data and
high-dimensional anomaly detection [85]. Additionally, it is essential to recognize that OCSVM
operates under the assumption of a single, clearly defined normal class, which may not always
align with the intricacies of real-world data. Hence, careful selection of hyperparameters and
thoughtful consideration of the problem domain are imperative to achieve optimal performance
when employing OCSVM [86].
9.5 Auto-Encoder
An autoencoder is a neural network architecture designed to compress data to its essential
features and then reconstruct the original input from this compressed form. Autoencoders are a
subset of neural networks used for unsupervised learning. Because they learn efficient data
representations without labels, they have attracted significant attention and are used across
various domains, from image processing to anomaly detection [87]. The network consists of an
input layer, hidden layers (including a bottleneck layer), and an output layer, organized into two
parts: an encoder and a decoder. The encoder condenses the input data into a lower-dimensional
representation, often termed the "latent space" or "encoding," from which the decoder
reconstructs the initial input. Training the network to encode and decode its input forces it to
capture meaningful patterns and essential features of the data [88]. Figure 10 illustrates the
architecture of an auto-encoder.
Figure 10: Auto-Encoder Architecture
An anomaly detection system based on autoencoders, trained solely on normal traffic data,
endeavors to reproduce any given input as faithfully as feasible to the learned normal patterns.
Consequently, we can identify an input instance as a potential attack if its reconstruction error
surpasses a predetermined threshold; otherwise, the input instance can be categorized as normal.
Through this approach, an autoencoder-based anomaly detection system exhibits the capability to
discern unknown attack types whenever their patterns diverge from the established normal
patterns. However, existing methodologies encounter certain shortcomings, suggesting scope for
further enhancements. One notable limitation in current approaches involves treating the
reconstruction error as a singular value. In prevailing methodologies, the reconstruction error
across all input vector elements or features is aggregated into a single value. Given that threshold
determination relies heavily on the reconstruction error, the classification based on this threshold
proves to be suboptimal [89].
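The difference between an aggregated and a per-feature reconstruction error can be illustrated with
a short sketch. No autoencoder is trained here: the reconstructed vector is a hand-written stand-in
for a model output, squared error is an assumed choice of error measure, and the percentile-based
threshold is one simple heuristic among several, not the method proposed in [89].

```python
def reconstruction_errors(x, x_hat):
    """Per-feature squared errors and their mean (the usual single aggregated value)."""
    per_feature = [(a - b) ** 2 for a, b in zip(x, x_hat)]
    return per_feature, sum(per_feature) / len(per_feature)

def percentile_threshold(errors, q=0.95):
    """Error value at quantile q of the errors observed on normal validation traffic."""
    ranked = sorted(errors)
    index = min(len(ranked) - 1, int(q * len(ranked)))
    return ranked[index]

def is_attack(error, threshold):
    """Flag an instance whose reconstruction error exceeds the threshold."""
    return error > threshold

# Hypothetical normalized input and its reconstruction: only the last feature is off.
x = [0.2, 0.5, 0.1, 0.9]
x_hat = [0.2, 0.5, 0.1, 0.3]
per_feature, aggregated = reconstruction_errors(x, x_hat)
```

In this example one feature accounts for the entire error (about 0.36), yet the aggregated value is only
about 0.09. Averaging across features can therefore hide a large deviation in a single feature, which
is the limitation of single-value thresholds noted above.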
Data collection for autoencoder-based anomaly detection in IPS/IDS systems involves gathering
network traffic data, which includes details like source and destination IP addresses, port numbers,
protocol types (e.g., TCP, UDP), packet sizes, and timestamps. These datasets are then curated to
create labeled examples of both normal and anomalous network traffic, essential for training and
evaluating the performance of anomaly detection models. Once the data is collected, preprocessing
steps are undertaken to prepare it for training. This includes feature extraction, where relevant
features are derived from the raw network traffic data. Additionally, normalization techniques are
applied to ensure that the extracted features have consistent scales and distributions, facilitating
effective training [90].
Model development entails selecting or designing an appropriate autoencoder architecture for
anomaly detection in network traffic. This architecture typically comprises an encoder, a
bottleneck layer, and a decoder. During the training phase, the autoencoder learns to reconstruct
normal network traffic accurately by minimizing its reconstruction error; anomalous traffic, which
the model has not seen, is then expected to yield a larger error.
Training the autoencoder involves feeding the normalized features extracted from the network
traffic data into the input layer. Through forward propagation, the input data traverses the encoder
layers, which gradually reduce its dimensionality until it reaches the bottleneck layer; the decoder
then expands this compressed representation back to the original dimensionality. Subsequently, the
reconstruction error, calculated as the difference between the input data and the reconstructed
output, is minimized using backpropagation with an optimizer such as gradient descent [88][89].
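The training loop described above can be illustrated with a deliberately tiny sketch: a linear autoencoder with two input features and a one-unit bottleneck, trained by full-batch gradient descent in plain Python. This is an assumption-laden toy, not the architecture of [88] or [89]; real systems use nonlinear multi-layer networks built with a deep learning library. The synthetic "normal" data, learning rate, and epoch count are all hypothetical.

```python
def encode(w_enc, x):
    """Encoder: project a 2-feature input onto the 1-unit bottleneck."""
    return w_enc[0] * x[0] + w_enc[1] * x[1]


def decode(w_dec, h):
    """Decoder: reconstruct the 2 features from the bottleneck value."""
    return [w_dec[0] * h, w_dec[1] * h]


def reconstruction_error(w_enc, w_dec, x):
    """Sum of squared differences between input and reconstruction."""
    x_hat = decode(w_dec, encode(w_enc, x))
    return sum((a - b) ** 2 for a, b in zip(x_hat, x))


def train(data, epochs=2000, lr=0.05):
    """Minimize the mean reconstruction error with gradient descent."""
    w_enc = [0.1, 0.1]  # small fixed initial weights (assumed)
    w_dec = [0.1, 0.1]
    n = len(data)
    for _ in range(epochs):
        g_enc = [0.0, 0.0]
        g_dec = [0.0, 0.0]
        for x in data:
            # Forward pass: encoder, bottleneck, decoder.
            h = encode(w_enc, x)
            x_hat = decode(w_dec, h)
            err = [x_hat[j] - x[j] for j in range(2)]
            # Backward pass: gradients of the squared error.
            for j in range(2):
                g_dec[j] += 2 * err[j] * h / n
            g_h = 2 * (err[0] * w_dec[0] + err[1] * w_dec[1])
            for m in range(2):
                g_enc[m] += g_h * x[m] / n
        # Gradient descent update.
        for j in range(2):
            w_dec[j] -= lr * g_dec[j]
            w_enc[j] -= lr * g_enc[j]
    return w_enc, w_dec


# Synthetic "normal" traffic: the second feature is twice the first.
normal = [[i / 19, 2 * i / 19] for i in range(20)]
w_enc, w_dec = train(normal)

max_normal_error = max(reconstruction_error(w_enc, w_dec, x) for x in normal)
# A point that violates the learned relationship reconstructs poorly.
anomaly_error = reconstruction_error(w_enc, w_dec, [1.0, -2.0])
```

Because the model is trained only on points that follow the normal pattern, it reconstructs them almost exactly, while the point that breaks the pattern yields a much larger reconstruction error.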
Evaluation of the trained autoencoder model involves assessing its performance using metrics such
as accuracy, precision, recall, and F1 score on a separate test dataset. Based on the evaluation
results, a threshold for the reconstruction error is determined. Instances with reconstruction errors
above this threshold are classified as anomalies. Finally, the trained autoencoder-based anomaly
detection model is integrated into IPS/IDS systems for real-time monitoring of incoming network
traffic. The deployed model continuously analyzes traffic, identifying anomalies based on their
reconstruction errors. When anomalies are detected, the IPS/IDS system generates alerts or takes
automated actions to mitigate potential threats, thus enhancing the security posture of the network
[87][91].
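The evaluation metrics named above can be computed directly from the thresholded reconstruction errors. The sketch below uses hypothetical errors, labels (1 = attack, 0 = normal), and a hypothetical threshold; none of these values come from the cited studies.

```python
def evaluate(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = attack)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical reconstruction errors on a labeled test set.
errors = [0.02, 0.03, 0.40, 0.05, 0.60, 0.01, 0.07, 0.04]
labels = [0, 0, 1, 0, 1, 0, 1, 0]
THRESHOLD = 0.10  # assumed value

# Instances whose error exceeds the threshold are classified as anomalies.
predictions = [1 if e > THRESHOLD else 0 for e in errors]
metrics = evaluate(labels, predictions)
```

Here one attack (error 0.07) falls below the threshold, so precision is perfect while recall drops to two thirds, which shows why accuracy alone is not a sufficient measure.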
10. Discussion
This section discusses the pivotal role of AI in strengthening the capabilities of both
Intrusion Detection Systems (IDS) and Intrusion Prevention Systems (IPS). By leveraging AI
techniques, such as supervised and unsupervised learning algorithms, IDS and IPS can achieve
greater effectiveness and efficiency in identifying and responding to security threats.
AI algorithms can analyze vast amounts of data with greater speed and accuracy than traditional
rule-based approaches. Supervised learning algorithms, for instance, can learn from labeled data
to detect known threats with high precision. According to Faruq et al., their research findings
indicate that Random Forest outperforms other algorithms in terms of both accuracy and
processing time, achieving a 98.70% accuracy rate. Other classifiers, such as Naïve Bayes, MLP,
SVM, and KNN, achieve accuracy rates of 80.73%, 92.03%, 95.2%, and 97.31% respectively.
These results suggest that Random Forest can enhance DDoS attack detection and contribute to
cyber defense preparedness [51]. Unsupervised learning algorithms, in contrast, can identify
anomalies indicative of potential security breaches. Nguyen et al. developed a framework that both
detects and explains anomalous network traffic. The framework consists of two main components:
a Variational Autoencoder (VAE) and a gradient-based fingerprinting technique for explaining
anomalies. Their results indicate that this approach successfully characterizes a range of
anomalies and accurately identifies them through distinct fingerprints [92].
AI-enabled IPS and IDS can enhance recall by providing more accurate and adaptive threat
detection capabilities, leading to better identification of security incidents, and reducing the
likelihood of missed detections. To validate their effectiveness in detecting intrusions, machine
learning-based IDS should demonstrate high accuracy, recall, and F-measure. Evaluation of
intrusion detection models reveals that those utilizing the iForest algorithm achieve superior
accuracy, precision, and recall, with lower false positive rates compared to those using the OCSVM
algorithm. In a separate trial on Cross-Site Scripting detection, classifiers, particularly SVM
(linear), achieved high accuracy, precision, and recall in discriminating between malicious and
benign scripts [93] [94]. Random Forest (RF) performed slightly better than the other classifiers,
with an accuracy of 99.91%, precision of 99.80%, and recall of 100%. These results indicate that
the classifiers can effectively detect Cross-Site Scripting attacks in real time [94]. Naive Bayes VII
is seen to perform the worst among all the classifiers, with zero precision and zero F1-score for SQL
[81].
AI algorithms can help reduce false positives by accurately distinguishing between normal
network behavior and genuine security threats. Unsupervised learning algorithms can be trained
to recognize patterns indicative of malicious activity, leading to fewer false alarms and improved
operational efficiency. The researchers proposed an autoencoder-based method for anomaly
detection that aims to minimize false positives and false negatives by adjusting the threshold on
the reconstruction error. A comparative evaluation against mean and stochastic thresholding
methods on KDDCUP’99 network traffic connections shows that the proposed method outperforms
these traditional methods. The results suggest that autoencoder-based
anomaly detection, using the proposed thresholding method, provides better classification
performance and can serve as a successful alternative to hybrid anomaly detection methods [80].
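The idea of adjusting the threshold to trade off false positives against false negatives can be sketched as a simple sweep over candidate thresholds on labeled validation data. This is a generic illustration under assumed scores and labels; it is not the thresholding method proposed in the cited work, nor the mean or stochastic baselines it was compared against.

```python
def count_errors(scores, labels, threshold):
    """Return (false positives, false negatives) at a given threshold."""
    fp = sum(1 for s, y in zip(scores, labels) if s > threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s <= threshold and y == 1)
    return fp, fn


def best_threshold(scores, labels):
    """Pick the candidate threshold with the fewest misclassifications."""
    candidates = sorted(set(scores))
    return min(candidates, key=lambda t: sum(count_errors(scores, labels, t)))


# Hypothetical reconstruction errors with labels (1 = attack, 0 = normal).
scores = [0.02, 0.03, 0.40, 0.05, 0.60, 0.01, 0.07, 0.04, 0.30, 0.09]
labels = [0, 0, 1, 0, 1, 0, 1, 0, 0, 1]

threshold = best_threshold(scores, labels)
fp, fn = count_errors(scores, labels, threshold)
```

Weighting false negatives more heavily than false positives (or the reverse) would only require changing the key function, which is often appropriate when a missed attack is costlier than a false alarm.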
11. Future Work
Improving the interpretability of unsupervised learning models is crucial for effective
anomaly detection. Researchers are exploring methods to make unsupervised learning models
more interpretable, enabling better understanding and validation of the learned representations.
Techniques such as visualization and explanation of the learned features, as well as interpretable
evaluation metrics, can aid in assessing the quality and relevance of the detected anomalies. Active
learning strategies offer a means to address the challenge of acquiring labeled data by selecting the
most informative samples for labeling. These strategies intelligently choose which instances to
label, thereby maximizing the utilization of limited labeled data and accelerating the training
process of unsupervised learning models. Active learning can be particularly useful in scenarios
where manual labeling is costly or impractical [96].
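One simple way to choose "the most informative samples" is uncertainty sampling: querying labels for the instances whose anomaly score lies closest to the decision threshold. The sketch below assumes this strategy purely for illustration; the scores, the threshold, and the labeling budget are hypothetical, and [96] does not prescribe this procedure.

```python
def select_for_labeling(scores, threshold, budget):
    """Return indices of the `budget` samples closest to the threshold.

    Samples near the threshold are the ones the detector is least
    certain about, so an analyst's label is assumed to help most there.
    """
    ranked = sorted(range(len(scores)),
                    key=lambda i: abs(scores[i] - threshold))
    return ranked[:budget]


# Hypothetical anomaly scores for six unlabeled flows.
scores = [0.02, 0.48, 0.95, 0.55, 0.10, 0.70]
queried = select_for_labeling(scores, threshold=0.5, budget=2)
```

Clearly normal (0.02) and clearly anomalous (0.95) flows are skipped, so the limited labeling effort is spent on the borderline cases.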
Finally, transfer learning and domain adaptation techniques can be employed to leverage
knowledge from related tasks or domains to enhance the performance of unsupervised learning
models. By transferring knowledge from pre-trained models or adapting models to new domains,
these methods can help overcome the limitations of unsupervised learning, especially in scenarios
where labeled data is scarce or unavailable.
12. Conclusion
In conclusion, the utilization of artificial intelligence (AI) techniques in bolstering the capabilities
of Intrusion Detection Systems (IDS) and Intrusion Prevention Systems (IPS) holds immense
potential for enhancing cybersecurity defenses. Through the application of supervised and
unsupervised learning algorithms, AI enables IDS and IPS to achieve greater efficiency and
accuracy in identifying and responding to security threats.
Supervised learning algorithms, such as Random Forest and Support Vector Machine (SVM), excel
in detecting known threats with high precision. These algorithms learn from labeled data, enabling
them to distinguish between normal and malicious network behavior effectively. Moreover,
unsupervised learning algorithms, including autoencoders, Isolation Forest, and One-Class
Support Vector Machine (OCSVM), provide valuable capabilities in identifying anomalies
indicative of potential security breaches. These algorithms analyze vast amounts of data without
explicit labels, making them particularly useful for detecting unknown threats and reducing false
positives. However, challenges remain in the development of effective anomaly detection systems.
The scarcity of labeled data poses a significant hurdle, which can be addressed through semi-
supervised learning techniques. By combining labeled and unlabeled data, semi-supervised
learning algorithms improve model performance and reduce the reliance on labeled data.
Additionally, self-supervised learning techniques offer promise in learning meaningful
representations from unlabeled data by formulating tasks that generate pseudo-labels from the data
itself. Interpretability of unsupervised learning models is another crucial aspect that requires
attention.
Enhancing the interpretability of these models through visualization, explanation of learned
features, and interpretable evaluation metrics is essential for better understanding and validation
of detected anomalies. Furthermore, active learning strategies can optimize the utilization of
limited labeled data by selecting the most informative samples for labeling, thus accelerating the
training process of unsupervised learning models. Lastly, transfer learning and domain adaptation
techniques can leverage knowledge from related tasks or domains to enhance the performance of
unsupervised learning models, especially in scenarios where labeled data is scarce or unavailable.
Overall, the integration of AI into IDS and IPS systems offers significant opportunities for
improving cybersecurity defenses. Future research should focus on addressing the limitations of
unsupervised learning, enhancing interpretability, and optimizing the utilization of labeled and
unlabeled data to develop more robust and effective anomaly detection systems for safeguarding
network infrastructure against evolving cyber threats.
References
[1] Z. Chiba, N. Abghour, K. Moussaid, O. Lifandali and R. Kinta, "Review of Recent Intrusion Detection
Systems and Intrusion Prevention Systems in IoT Networks," in International Conference on Software,
Telecommunications and Computer Networks (SoftCOM), Split, Croatia, 2022, pp. 1-6, doi:
10.23919/SoftCOM55329.2022.9911401, 2022.
[2] I. H. Sarker, M. H. Furhad and R. Nowrozy, "AI-Driven Cybersecurity: An Overview, Security
Intelligence Modeling and Research Directions," https://doi.org/10.1007/s42979-021-00557-0, 2021.
[3] "IBM Data Breach Report," 2023, [Online]. Available: https://www.ibm.com/reports/data-breach.
[4] R. Calderon, "The Benefits of Artificial Intelligence in Cybersecurity," Economic Crime Forensics
Capstones. 36., 2019. [Online]. Available: https://digitalcommons.lasalle.edu/ecf_capstones/36.
[5] J. O. Mebawondu, O. D. Alowolodu, J. O. Mebawondu and A. O. Adetunmbi, "Network intrusion
detection system using supervised learning paradigm," in Scientific African, September 2020.
[6] M. Verkerken, L. D’hooge, T. Wauters, B. Volckaert and F. De Turck, "Unsupervised Machine Learning
Techniques for Network Intrusion Detection on Modern Data," in 2020 4th Cyber Security in Networking
Conference (CSNet), Lausanne, Switzerland, pp. 1-8, doi: 10.1109/CSNet50428.2020.9265461., 2020.
[7] A. Halimaa A. and K. Sundarakantham, "Machine Learning Based Intrusion Detection System," in 2019
3rd International Conference on Trends in Electronics and Informatics (ICOEI), Tirunelveli, India, pp. 916-
920, doi: 10.1109/ICOEI.2019.8862784., 2019.
[8] A. Krishna, A. Lal M.A, A. J. Mathewkutty, D. S. Jacob and M. Hari, "Intrusion Detection and
Prevention System Using Deep Learning," in 2020 International Conference on Electronics and Sustainable
Communication Systems (ICESC), Coimbatore, India, pp. 273-278, doi:
10.1109/ICESC48915.2020.9155711., 2020.
[9] R. Vinayakumar, M. Alazab, K. P. Soman, P. Poornachandran, A. Al-Nemrat and S. Venkatraman, "Deep
Learning Approach for Intelligent Intrusion Detection System," in IEEE Access, vol. 7, pp. 41525-41550,
doi: 10.1109/ACCESS.2019.2895334, 2019.
[10] J. Jha and L. Ragha, "Intrusion Detection System using Support Vector," in International Conference
& workshop on Advanced Computing 2013, New York, USA, 2013.
[11] S. Parhizkari, Anomaly Detection in Intrusion Detection Systems, DOI: 10.5772/intechopen.112733,
2023.
[12] Peddabachigari, Sandhya, et al. "Modeling intrusion detection system using hybrid intelligent
systems." Journal of network and computer applications 30.1 (2007): 114-132.
[13] K. T. Khaing, "Recursive Feature Elimination (RFE) and k-Nearest Neighbor (KNN) in SVM," in
*Proceedings of the 2010 International Conference on Computer Science and Information Technology*,
2010, pp. 318-323.
[14] J.F Joseph, A. Das and B.C. Seet, "Cross-Layer Detection of Sinking Behavior in Wireless Ad Hoc
Networks Using SVM and FDA," in IEEE Transaction on dependable and secure computing, Vol. 8, No. 2,
2011.
[15] N. D. Patel, B. M. Mehtre and R. Wankar, "Detection of Intrusions using Support Vector Machines and
Deep Neural Networks," in 10th International Conference on Reliability, Infocom Technologies and
Optimization (Trends and Future Directions) (ICRITO) doi: 10.1109/ICRITO56286.2022.9964756, Noida,
India, 2022
[16] R. Saravanan and P. Sujatha, "A State of Art Techniques on Machine Learning Algorithms: A
Perspective of Supervised Learning Approaches in Data Classification," in Second International
Conference on Intelligent Computing and Control Systems (ICICCS), Madurai, India, 2018, pp. 945-949,
doi: 10.1109/ICCONS.2018.8663155, 2018
[17] M. Masdari and H. Khezri, "A survey and taxonomy of the fuzzy signature-based Intrusion Detection
Systems," in Applied Soft Computing, doi.org/10.1016/j.asoc.2020.106301, 2019.
[18] A. Aldweesh, A. Derhab and A. Z. Emam, "Deep learning approaches for anomaly-based intrusion
detection systems: A survey, taxonomy, and open issues," in Knowledge-Based Systems,
doi.org/10.1016/j.knosys.2019.105124, 2020.
[19] Yu Xiao and Maria Watson, "Guidance on Conducting a Systematic Literature Review," in Journal of
Planning Education and Research, doi.org/10.1177/0739456X17723971, 2017.
[20] W. Mengist, T. Soromessa and G. Legese, "Method for conducting systematic literature review and
meta-analysis for environmental science research," in MethodsX, doi.org/10.1016/j.mex.2019.100777,
2020.
[21] A. Omazic and B. Zunk, "Semi-Systematic Literature Review on Sustainability and Sustainable
Development in Higher Education Institutions," in Institute of Business Economics and Industrial Sociology,
Graz University of Technology, 8010 Graz, Austria, doi.org/10.3390/su13147683, 2021.
[22] "Integrative Review," Adelphi University, 2020. [Online]. Available:
https://libguides.adelphi.edu/Systematic_Reviews/integrative-review. [Accessed 25 Feb 2024].
[23] M. A. Ferrag, L. Maglaras, S. Moschoyiannis and H. Janicke, "Deep learning for cyber security
intrusion detection: Approaches, datasets, and comparative study," in Journal of Information Security and
Applications, doi.org/10.1016/j.jisa.2019.102419, 2020.
[24] Y. SAHLI, "A comparison of the NSL-KDD dataset and its predecessor the KDD Cup ’99 dataset," in
International Journal of Scientific Research and Management (IJSRM) Volume10, Issue 04, DOI:
10.18535/ijsrm/v10i4.ec05, 2020.
[25] A. Thakkar and R. Lohiya, "A Review of the Advancement in Intrusion Detection Datasets," in
International Conference on Computational Intelligence and Data Science (ICCIDS 2019), 2019.
[26] R. Panigrahi and S. Borah, "A detailed analysis of CICIDS2017 dataset for designing Intrusion
Detection Systems," in International Journal of Engineering & Technology, 7 (3.24) (2018) 479-482, 2018.
[27] Kurniabudi, D. Stiawan, Darmawijoyo, M. Y. Bin Idris, A. M. Bamhdi and R. Budiarto, "CICIDS-
2017 Dataset Feature Analysis with Information Gain for Anomaly Detection," in IEEE Access, vol. 8,
pp. 132911-132921, 2020, doi: 10.1109/ACCESS.2020.3009843, 2020.
[28] T. Su, H. Sun, J. Zhu, S. Wang and Y. Li, "BAT: Deep Learning Methods on Network Intrusion
Detection Using NSL-KDD Dataset," in IEEE Access, vol. 8, pp. 29575-29585, doi:
10.1109/ACCESS.2020.2972627, 2020.
[29] Kim, J.; Kim, J.; Kim, H.; Shim, M.; Choi, E. CNN-Based Network Intrusion Detection against
Denial-of-Service Attacks. Electronics 2020, 9, 916. https://doi.org/10.3390/electronics9060916
[30] I. Ahmad, A. B. Abdullah, and A. S. Alghamdi, "Remote to Local attack detection using supervised
neural network," 2010 International Conference for Internet Technology and Secured Transactions,
London, UK, 2010, pp. 1-6.
[31] K. Labib and R. V. Vemuri, “Detecting and Visualizing Denial-of-Service and Network Probe Attacks
Using Principal Component Analysis” 2008
[32] Jeya, P. Gifty, M. Ravichandran, and C. S. Ravichandran. "Efficient classifier for R2L and U2R
attacks." International Journal of Computer Applications 45.21 (2012): 28-32.
[33] L. Singh and H. Jahankhani, “An Approach of Applying, Adapting Machine Learning into the IDS
and IPS Component to Improve Its Effectiveness and Its Efficiency”, Artificial Intelligence in Cyber
Security: Impact and Implications, 2021, DOI: 10.1007/978-3-030-88040-8_2
[34] M. Mohammadi, M. Dawodi, W. Tomohisa and N. Ahmadi, "Comparative study of supervised learning
algorithms for student performance prediction," 2019 International Conference on Artificial Intelligence in
Information and Communication (ICAIIC), Okinawa, Japan, 2019, pp. 124-127, doi:
10.1109/ICAIIC.2019.8669085.
[35] R. E. AlMamlook, K. M. Kwayu, M. R. Alkasisbeh and A. A. Frefer, "Comparison of Machine
Learning Algorithms for Predicting Traffic Accident Severity," 2019 IEEE Jordan International Joint
Conference on Electrical Engineering and Information Technology (JEEIT), Amman, Jordan, 2019, pp.
272-276, doi: 10.1109/JEEIT.2019.8717393.
[36] H. Hu and Z. Zhou, "Evaluation and Comparison of Ten Machine Learning Classification Models
Based on the Mobile Users Experience," 2023 3rd International Conference on Electronic Information
Engineering and Computer Science (EIECS), Changchun, China, 2023, pp. 767-771, doi:
10.1109/EIECS59936.2023.10435603.
[37] Kunal and M. Dua, "Machine Learning Approach to IDS: A Comprehensive Review," 2019 3rd
International conference on Electronics, Communication and Aerospace Technology (ICECA),
Coimbatore, India, 2019, pp. 117-121, doi: 10.1109/ICECA.2019.8822120.
[38] L. A. H. Ahmed and Y. A. M. Hamad, "Machine Learning Techniques for Network-based Intrusion
Detection System: A Survey Paper," 2021 National Computing Colleges Conference (NCCC), Taif, Saudi
Arabia, 2021, pp. 1-7, doi: 10.1109/NCCC49330.2021.9428827.
[39] J. Alzubi et al. “Machine Learning from Theory to Algorithms: An Overview”, IOP Conf. Series:
Journal of Physics: Conf. Series 1142 (2018) 012012, doi:10.1088/1742-6596/1142/1/012012
[40] Mahesh, Batta. "Machine learning algorithms-a review." International Journal of Science and Research
(IJSR). [Internet] 9.1 (2020): 381-386.
[41] N. Chakraborty, “Intrusion Detection System and Intrusion Prevention System: A Comparative Study”,
International Journal of Computing and Business Research (IJCBR) ISSN (Online): 2229-6166 Volume 4
Issue 2013, Indore, India
[42] F. -J. Yang, "An Extended Idea about Decision Trees," 2019 International Conference on
Computational Science and Computational Intelligence (CSCI), Las Vegas, NV, USA, 2019, pp. 349-354,
doi: 10.1109/CSCI49370.2019.00068.
[43] Sen, Pratap Chandra, Mahimarnab Hajra, and Mitadru Ghosh. "Supervised classification algorithms in
machine learning: A survey and review." Emerging Technology in Modelling and Graphics: Proceedings of
IEM Graph 2018. Springer Singapore, 2020.
[44] H. Wang and B. Chen, "Intrusion detection system based on multi-strategy pruning algorithm of the
decision tree," Proceedings of 2013 IEEE International Conference on Grey systems and Intelligent
Services (GSIS), Macao, China, 2013, pp. 445-447, doi: 10.1109/GSIS.2013.6714823.
[45] A. A. Wighneswara et al., "Network Behavior Anomaly Detection using Decision Tree," 2023 IEEE
12th International Conference on Communication Systems and Network Technologies (CSNT), Bhopal,
India, 2023, pp. 705-709, doi: 10.1109/CSNT57126.2023.10134589.
[46] K. R. Saraswat, M. S. Devi, D. Professor and A. Guleria, “Decision Tree Based Algorithm for Intrusion
Detection,” International Journal of Advanced Networking and Applications, 2016
[47] H. Liu and M. Zhou, "Decision tree rule-based feature selection for large-scale imbalanced data," 2017
26th Wireless and Optical Communication Conference (WOCC), Newark, NJ, USA, 2017, pp. 1-6, doi:
10.1109/WOCC.2017.7928973.
[48] G. Chaaya and H. Maalouf, "Anomaly detection on a real-time server using decision trees step by step
procedure," 2017 8th International Conference on Information Technology (ICIT), Amman, Jordan, 2017,
pp. 127-133, doi: 10.1109/ICITECH.2017.8079989.
[49] M. Choubisa, R. Doshi, N. Khatri and K. Kant Hiran, "A Simple and Robust Approach of Random
Forest for Intrusion Detection System in Cyber Security," 2022 International Conference on IoT and
Blockchain Technology (ICIBT), Ranchi, India, 2022, pp. 1-5, doi: 10.1109/ICIBT52874.2022.9807766.
[50] S. Afroz, S. M. Ariful Islam, S. Nawer Rafa and M. Islam, "A Two Layer Machine Learning System
for Intrusion Detection Based on Random Forest and Support Vector Machine," 2020 IEEE International
Women in Engineering (WIE) Conference on Electrical and Computer Engineering (WIECON-ECE),
Bhubaneswar, India, 2020, pp. 300-303, doi: 10.1109/WIECON-ECE52138.2020.9397945.
[51] Z. F. Faruq, T. Mantoro, M. A. Catur Bhakti and Wandy, "Random Forest Classifier Evaluation in
DDoS Detection System for Cyber Defence Preparation," 2022 IEEE 8th International Conference on
Computing, Engineering and Design (ICCED), Sukabumi, Indonesia, 2022, pp. 1-5, doi:
10.1109/ICCED56140.2022.10010341.
[52] Liu, Zhenpeng, et al. "A deep random forest model on spark for network intrusion detection." Mobile
Information Systems 2020 (2020): 1-16.
[53] Shah, Sandeep, et al. "Implementing a network intrusion detection system using semi-supervised
support vector machine and random forest." Proceedings of the 2021 ACM southeast conference. 2021.
[54] Bhati, Bhoopesh Singh, and Chandra Shekhar Rai. "Analysis of support vector machine-based
intrusion detection techniques." Arabian Journal for Science and Engineering 45.4 (2020): 2371-2383.
[55] Mohammadi, Mokhtar, et al. "A comprehensive survey and taxonomy of the SVM-based intrusion
detection systems." Journal of Network and Computer Applications 178 (2021): 102983.
[56] Kim, D.S., Park, J.S. “Network-Based Intrusion Detection with Support Vector Machines”. In: Kahng,
HK. (eds) Information Networking. ICOIN 2003. Lecture Notes in Computer Science, vol 2662, (2003).
Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45235-5_73
[57] I. Das, S. Singh and A. Sarkar, "Serial and Parallel based Intrusion Detection System using Machine
Learning," 2021 Devices for Integrated Circuit (DevIC), Kalyani, India, 2021, pp. 340-344, doi:
10.1109/DevIC50843.2021.9455936.
[58] A. BACHAR, N. E. MAKHFI and O. E. Bannay, "Towards a behavioral network intrusion detection
system based on the SVM model," 2020 1st International Conference on Innovative Research in Applied
Science, Engineering and Technology (IRASET), Meknes, Morocco, 2020, pp. 1-7, doi:
10.1109/IRASET48871.2020.9092094.
[59] Razdan, Sanjay, Himanshu Gupta, and Ashish Seth. "Performance analysis of network intrusion
detection systems using j48 and naive bayes algorithms." 2021 6th International Conference for
Convergence in Technology (I2CT). IEEE, 2021.
[60] M. Panda and M. R. Patra, "Network intrusion detection using naive bayes," International Journal of
Computer Science and Network Security, vol. 7, no. 12, pp. 258-263, Dec. 2007. [Online]. Available:
https://www.researchgate.net/publication/241397131_Network_intrusion_detection_using_naive_bayes.
[61] M. Vishwakarma, and N. Kesswani, “A new two-phase intrusion detection system with Naïve Bayes
machine learning for data classification and elliptic envelop method for anomaly detection”, 2023,
Department of Computer Science, Central University of Rajasthan, Ajmer, India.
https://doi.org/10.1016/j.dajour.2023.100233
[62] F. Gumus, C. O. Sakar, Z. Erdem and O. Kursun, "Online Naive Bayes classification for network
intrusion detection," 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis
and Mining (ASONAM 2014), Beijing, China, 2014, pp. 670-674, doi: 10.1109/ASONAM.2014.6921657.
[63] V. D. Katkar and S. V. Kulkarni, "Experiments on detection of Denial-of-Service attacks using Naive
Bayesian classifier," 2013 International Conference on Green Computing, Communication and
Conservation of Energy (ICGCE), Chennai, India, 2013, pp. 725-730, doi: 10.1109/ICGCE.2013.6823529.
[64] S. Mukherjee Dr., and N. Sharma, “Intrusion Detection using Naive Bayes Classifier with Feature
Reduction”, Department of Computer Science, Banasthali University, Jaipur,Rajasthan, 304022,India,
2012, https://doi.org/10.1016/j.protcy.2012.05.017
[65] T. Tun, K. K. Wai and M. S. Khaing, "Performance of Machine Learning Using Preprocessing and
Classification for Intrusion Detection System," 2023 IEEE Conference on Computer Applications (ICCA),
Yangon, Myanmar, 2023, pp. 260-265, doi: 10.1109/ICCA51723.2023.10181620
[66] A. S. Gillis, “Unsupervised Learning”, TechTarget, n.d.,
https://www.techtarget.com/searchenterpriseai/definition/unsupervised-learning, accessed 23-04-2023
[67] Verkerken, Miel, et al. "Unsupervised machine learning techniques for network intrusion detection on
modern data." 2020 4th Cyber Security in Networking Conference (CSNet). IEEE, 2020.
[68] Verkerken, Miel, et al. "Towards model generalization for intrusion detection: Unsupervised machine
learning techniques." Journal of Network and Systems Management 30 (2022): 1-25.
[69] D. Deng, "Research on Anomaly Detection Method Based on DBSCAN Clustering Algorithm," 2020
5th International Conference on Information Science, Computer Technology and Transportation (ISCTT),
Shenyang, China, 2020, pp. 439-442, doi: 10.1109/ISCTT51595.2020.00083.
[70] A. F. Diallo and P. Patras, "Adaptive Clustering-based Malicious Traffic Classification at the Network
Edge," IEEE INFOCOM 2021 - IEEE Conference on Computer Communications, Vancouver, BC, Canada,
2021, pp. 1-10, doi: 10.1109/INFOCOM42981.2021.9488690.
[71] G. Pu, L. Wang, J. Shen and F. Dong, "A hybrid unsupervised clustering-based anomaly detection
method," in Tsinghua Science and Technology, vol. 26, no. 2, pp. 146-153, April 2021, doi:
10.26599/TST.2019.9010051.
[72] Almaiah, Mohammed Amin, et al. "Performance investigation of principal component analysis for
intrusion detection system using different support vector machine kernels." Electronics 11.21 (2022): 3571.
[73] A. Mishra, A. M. K. Cheng and Y. Zhang, "Intrusion Detection Using Principal Component Analysis
and Support Vector Machines," 2020 IEEE 16th International Conference on Control & Automation
(ICCA), Singapore, 2020, pp. 907-912, doi: 10.1109/ICCA51439.2020.9264568.
[74] S. Waskle, L. Parashar and U. Singh, "Intrusion Detection System Using PCA with Random Forest
Approach," 2020 International Conference on Electronics and Sustainable Communication Systems
(ICESC), Coimbatore, India, 2020, pp. 803-808, doi: 10.1109/ICESC48915.2020.9155656.
[75] Sharipuddin et al., "Features Extraction on IoT Intrusion Detection System Using Principal
Components Analysis (PCA)," 2020 7th International Conference on Electrical Engineering, Computer
Sciences and Informatics (EECSI), Yogyakarta, Indonesia, 2020, pp. 114-118, doi:
10.23919/EECSI50503.2020.9251292.
[76] Bharadiya, Jasmin Praful. "A tutorial on principal component analysis for dimensionality reduction in
machine learning." International Journal of Innovative Science and Research Technology 8.5 (2023): 2028-
2032.
[77] Snehaa S, Shwetha G, and B. Priya, "Network Intrusion Detector Based On Isolation … Forest
Algorithm," 2022 1st International Conference on Computational Science and Technology (ICCST),
CHENNAI, India, 2022, pp. 932-935, doi: 10.1109/ICCST55948.2022.10040395.
[78] A. Vikram and Mohana, "Anomaly detection in Network Traffic Using Unsupervised Machine learning
Approach," 2020 5th International Conference on Communication and Electronics Systems (ICCES),
Coimbatore, India, 2020, pp. 476-479, doi: 10.1109/ICCES48766.2020.9137987.
[79] A. Patil, D. Machale, D. Goswami, P. Muley and P. Rajarapollu, "Anomaly-Based Intrusion Detection
System for IoT Environment Using Machine Learning," 2023 IEEE International Carnahan Conference on
Security Technology (ICCST), Pune, India, 2023, pp. 1-4, doi: 10.1109/ICCST59048.2023.10474238.
[80] K. Shanthi and R. Maruthi, "Machine Learning Approach for Anomaly-Based Intrusion Detection
Systems Using Isolation Forest Model and Support Vector Machine," 2023 5th International Conference on
Inventive Research in Computing Applications (ICIRCA), Coimbatore, India, 2023, pp. 136-139, doi:
10.1109/ICIRCA57980.2023.10220620.
[81] S. Bhadauria and T. Mohanty, "Hybrid Intrusion Detection System using an Unsupervised method for
Anomaly-based Detection," 2021 IEEE International Conference on Advanced Networks and
Telecommunications Systems (ANTS), Hyderabad, India, 2021, pp. 1-6, doi:
10.1109/ANTS52808.2021.9936919.
[82] Wenli Shang, Lin Li, Ming Wan and Peng Zeng, "Industrial communication intrusion detection
algorithm based on improved one-class SVM," 2015 World Congress on Industrial Control Systems
Security (WCICSS), London, 2015, pp. 21-25, doi: 10.1109/WCICSS.2015.7420317.
[83] M. Zhang, B. Xu and J. Gong, "An Anomaly Detection Model Based on One-Class SVM to Detect
Network Intrusions," 2015 11th International Conference on Mobile Ad-hoc and Sensor Networks (MSN),
Shenzhen, China, 2015, pp. 102-107, doi: 10.1109/MSN.2015.40.
[84] H. Dong and D. Peng, "Research on abnormal detection of ModbusTCP/IP protocol based on one-
class SVM," 2018 33rd Youth Academic Annual Conference of Chinese Association of Automation (YAC),
Nanjing, China, 2018, pp. 398-403, doi: 10.1109/YAC.2018.8406407.
[85] L. Mhamdi, D. McLernon, F. El-moussa, S. A. Raza Zaidi, M. Ghogho and T. Tang, "A Deep Learning
Approach Combining Autoencoder with One-class SVM for DDoS Attack Detection in SDNs," 2020 IEEE
Eighth International Conference on Communications and Networking (ComNet), Hammamet, Tunisia,
2020, pp. 1-6, doi: 10.1109/ComNet47917.2020.9306073.
[86] Siregar, Sahrul Mulia, Yudha Purwanto, and Suryo Adhi Wibowo. "Enhancing Network Anomaly
Detection with Optimized One-Class SVM (OCSVM)." 2023 3rd International Conference on Intelligent
Cybernetics Technology & Applications (ICICyTA). IEEE, 2023.
[87] Y. N. Nguimbous, R. Ksantini and A. Bouhoula, "Anomaly-based Intrusion Detection Using Auto-
encoder," 2019 International Conference on Software, Telecommunications and Computer Networks
(SoftCOM), Split, Croatia, 2019, pp. 1-5, doi: 10.23919/SOFTCOM.2019.8903799.
[88] V. Q. Nguyen, V. H. Nguyen, T. H. Hoang and N. Shone, "A Novel Deep Clustering Variational Auto-
Encoder for Anomaly-based Network Intrusion Detection," 2022 14th International Conference on
Knowledge and Systems Engineering (KSE), Nha Trang, Vietnam, 2022, pp. 1-7, doi:
10.1109/KSE56063.2022.9953763.
[89] B. Abolhasanzadeh, "Nonlinear dimensionality reduction for intrusion detection using auto-encoder
bottleneck features," 2015 7th Conference on Information and Knowledge Technology (IKT), Urmia, Iran,
2015, pp. 1-5, doi: 10.1109/IKT.2015.7288799.
[90] H. Xie, A. Li, R. Jiang, Y. Jia, L. Huang and W. Han, "Intrusion Detection Results Analysis Based on
Variational Auto-Encoder," 2019 IEEE Fourth International Conference on Data Science in Cyberspace
(DSC), Hangzhou, China, 2019, pp. 516-521, doi: 10.1109/DSC.2019.00084.
[91] R. Zhang and H. Chen, "Intrusion Detection of Industrial Control System Based on Stacked Auto-
Encoder," 2019 Chinese Automation Congress (CAC), Hangzhou, China, 2019, pp. 5638-5643, doi:
10.1109/CAC48633.2019.8997243.
[92] Nguyen, Quoc Phong, et al. "Gee: A gradient-based explainable variational autoencoder for network
anomaly detection." 2019 IEEE Conference on Communications and Network Security (CNS). IEEE, 2019.
[93] Ahmad, Zeeshan, et al. "Network intrusion detection system: A systematic study of machine learning
and deep learning approaches." Transactions on Emerging Telecommunications Technologies 32.1 (2021):
e4150.
[94] H. -C. Chen, A. Nshimiyimana, C. Damarjati and P. -H. Chang, "Detection and Prevention of Cross-
site Scripting Attack with Combined Approaches," 2021 International Conference on Electronics,
Information, and Communication (ICEIC), Jeju, Korea (South), 2021, pp. 1-4, doi:
10.1109/ICEIC51217.2021.9369796.
[95] H. Kye, M. Kim and M. Kwon, "Hierarchical Detection of Network Anomalies: A Self-Supervised
Learning Approach," in IEEE Signal Processing Letters, vol. 29, pp. 1908-1912, 2022, doi:
10.1109/LSP.2022.3203296.
[96] J. -X. Mi, A. -D. Li and L. -F. Zhou, "Review Study of Interpretation Methods for Future Interpretable
Machine Learning," in IEEE Access, vol. 8, pp. 191969-191985, 2020, doi:
10.1109/ACCESS.2020.3032756.