0% found this document useful (0 votes)

15 views14 pages

Sysj 20

Uploaded by

usmanmotti7678

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

15 views14 pages

Sysj 20

Uploaded by

usmanmotti7678

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 14

IEEE SYSTEMS JOURNAL 1

A Comprehensive Survey of Databases and

Deep Learning Methods for Cybersecurity and
Intrusion Detection Systems
D. Gümüşbaş, T. Yıldırım, A. Genovese, and F. Scotti, Senior Member, IEEE

Abstract—This survey presents a comprehensive overview of Different IDSs can employ diverse algorithms for detecting
Machine Learning (ML) methods for cybersecurity intrusion attacks. These algorithms can be classified into three categories
detection systems, with a specific focus on recent approaches [10]: i) rule-based algorithms, which use prior knowledge
based on Deep Learning (DL). The review analyzes recent
methods with respect to their intrusion detection mechanisms, of attacks, such as the corresponding data distributions, to
performance results, and limitations as well as whether they use create a rule system and perform detection; ii) statistics-based
benchmark databases to ensure a fair evaluation. In addition, a algorithms, which detect anomalies by building a statistical
detailed investigation of benchmark datasets for cybersecurity distribution of intrusion patterns; and iii) Machine Learning
is presented. This paper is intended to provide a road map (ML)-based approaches, in which learning algorithms are
for readers who would like to understand the potential of DL
methods for cybersecurity and intrusion detection systems, along adopted to train classifiers that can distinguish among different
with a detailed analysis of the benchmark datasets used in the types of attacks.
literature to train DL models. Rule-based methods, while simple and fast to execute,
Index Terms—Cybersecurity, IDS, Deep Learning. cannot compensate for incomplete or noisy data and are
difficult to update. To overcome these problems, statistics-
I. I NTRODUCTION based approaches have been proposed to enable the processing
of imprecise information; however, such methods entail a

C YBERSECURITY systems have been of great impor-

tance since the beginning of the computer network era.
However, security attacks emerged even before that: in 1941,
high computational cost and have a limited ability to handle
large quantities of data. Recently, ML-based approaches have
increasingly been studied due to their ability to use complex
Alan Turing cracked the Enigma machine, which was designed inference models that can be trained on large quantities of data
to cipher messages [1]. Similar incidents have continued to oc- to detect complex intrusion patterns [11].
cur through the present day; for example, the electrical system Due to the increasing quantities of data transmitted over
of Massachusetts Institute of Technology (MIT) was hacked the internet, which are leading to the introduction of new
in 1950, Yahoo accounts were stolen in 2014, and several networking paradigms (e.g., the Internet of Things – IoT,
worldwide organizations were affected by the WannaCry worm cloud computing, and fog/edge computing [12], [13]) and
in 2017 [2], [3]. According to the August 2019 threat reports complex inference models (e.g., Deep Learning (DL) [14],
from McAfee Labs, the top ten attack vectors at present are [15]), throughout the remainder of this paper, we will focus
malware, account hijacking, unknown, vulnerability, unautho- on ML-based approaches to cybersecurity and IDSs.
rized access, targeted attack, code injection, Denial of Service
(DoS), defacement, and theft [4]. Hence, not only known but A. Previous Surveys
also unknown attack vectors currently pose a significant cyber This subsection introduces previous surveys published in the
threat. literature on cybersecurity, with a specific focus on ML-based
The recent increase in the volume of data generated and methods. These surveys were chosen based on the criterion
transmitted over the internet, the need to manage the security of being either the most cited or a pacesetter review on a
of such data [5]–[7], and the continuing changes/evolution in specific topic. In this paper, in contrast to other surveys in the
intrusion types are causing both the academic and industrial literature, we summarize existing surveys based on their strong
communities to show increasing interest in the deployment of points. The objective is to help readers find further material
cybersecurity systems [8], [9]. To shield computer networks in accordance with their interests.
from attacks, Intrusion Detection Systems (IDSs) are being One of the most cited surveys in the literature is presented
deployed to support user authentication, ensure safe access, in [11]. This survey addresses the different ML methods used
and prevent loss of privacy. An IDS first collects and pro- in IDSs; describes the structure of Internet Protocol (IP) traffic
cesses data and then applies a detection mechanism to raise features, such as port-based, payload-based, and statistical
alarms, which are sent to a human network analyst for further features; and provides insight into feature categories such
screening. as packet-level and flow-level features. Although this review
D. Gümüşbaş and T. Yıldırım are with the Department of Electronics and was performed using a limited number of papers published
Communication Engineering, Yıldız Technical University, Istanbul, Turkey between 2004 and 2007 and some of the benchmark datasets
(e-mail: f0415058@std.yildiz.edu.tr; tulay@yildiz.edu.tr). described are now out of circulation, it presents valuable
A. Genovese and F. Scotti are with the Department of Computer Science,
Università degli Studi di Milano, Italy (e-mail: angelo.genovese@unimi.it; information on how to discover and extract novel IP-based
fabio.scotti@unimi.it). intrusion patterns/features from network traffic. The review
IEEE SYSTEMS JOURNAL 2

presented in [16], in addition to describing the various ML- puzzles; the limitations of attempts to detect these attacks; and
based methods of network intrusion detection, focuses on the attack generation scenarios.
characteristics of the types of intrusion. Therefore, this review Finally, there are several surveys that address specific as-
presents how available statistical features can be used and pects or applications of IDSs. For example, the work reported
modified for distributed attack detection and the importance in [24] focuses on IDSs for IoT systems, describing their
of the threshold used to process these types of features. taxonomy and placement strategies. In a similar manner, the
In contrast to [11], [16], the survey presented in [17] focuses review presented in [25] discusses DM concepts with IoT
on the use of ML and Data Mining (DM) concepts in IDSs. applications. Another example is the survey in [26], which
This review includes a clear explanation of ML and DM algo- covers only unsupervised methods used in IDSs. Although
rithms introduced in highly cited papers published before 2016 this review is limited to unsupervised methods, it is a good
as well as their usage in IDSs. Notably, this review does not reference for learning about a variety of feature selection
include the newest DL methods, such as Convolutional Neural methods. Additionally, datasets and EU standards (e.g., the
Networks (CNN); the newest datasets, such as AWID2018 General Data Protection Regulation – GDPR) for data col-
and CICIDS2017; or practical details such as attack frequency lection and protection are addressed in this review. Other
and sample size for the benchmark datasets. Nevertheless, this reviews considering specific aspects of this field include the
review does consider fuzzy logic, neural networks, genetic work described in [27], which focuses on hardware techniques
algorithms, and rule-based algorithms. for IDS implementation; the paper presented in [28], which
Similar to the survey presented in [17], the work reported in considers only immunity-based approaches; and the survey
[18] provides a review of ML methods for IDSs, associating published in [29], which describes network security techniques
different types of attacks with the features that can be used to for supervisory control and data acquisition systems.
detect them. In particular, the associated features can provide
insight into how similar features of different types of intru-
sion can support similar approaches to attack detection. For B. Contributions
example, the duration and service features from the KDD99
dataset are the most highly contributing features for detecting This work is intended to serve as an extensive survey
both User-to-Root (U2R) and Remote-to-Local (R2L) attacks, of databases and methods based on ML and DL that have
often causing these two attack types to be misclassified as one been introduced thus far in the literature on cybersecurity and
another. Although this paper fails to investigate the newest DL intrusion detection. This survey focuses on papers published
algorithms and attack types and their most related features, it after 2013, with some exceptions being trendsetter algorithms
provides an extensive survey of feature selection methods. or highly cited papers.
The review published in [19] surveys ML-based intrusion Compared to the other surveys on intrusion detection dis-
detection methods alongside newer DL-based methods. Al- cussed in Section I-A, this survey makes three main contribu-
though this survey focuses on certain specific ML and DL tions: i) it summarizes previous surveys with regard to their
methods, such as Deep Belief Networks (DBNs) and Recur- level of detail in describing methods for cybersecurity, with the
rent Neural Networks (RNNs), as well as known benchmark purpose of encouraging further reading based on the readers’
datasets, it does not cover other DL algorithms, such as CNNs, interests; ii) it focuses on a practical perspective when describ-
or benchmark datasets such as CICIDS2017. The reviews ing the relevant datasets, specifically addressing the number
presented in [14], [20], [21] also consider DL-based methods. of features, the feature types, and attack distributions rather
However, they focus on only a subset of these methods, do than describing general details, feature selection methods, and
not discuss benchmark datasets, or do not provide detailed algorithms, which are analyzed in other surveys; and iii) it
descriptions of the accuracies achieved using DL methods. presents a comprehensive investigation of the newest DL meth-
In contrast to the previously mentioned surveys, the work ods for intrusion detection, analyzing their detection capability,
presented in [22] focuses on the different types of attacks performance, and limitations as well as the databases used.
rather than algorithms for IDSs, without providing details on This review does not consider previous types of ML methods
accuracy. Furthermore, this paper presents an attack taxon- since they have been thoroughly addressed in other survey
omy to provide detailed definitions of various attack types, papers [11], [17], [18].
including how and in which layers they occur. Attack tools The remainder of this paper is organized as follows. Sec-
are also explained in great detail for readers who wish to tion II presents a review of cybersecurity datasets, including
build IDSs for protection against specific attack types. Al- the data collection steps, feature and attack types, bench-
though this paper does not provide detailed information about mark datasets, and reliability criteria. Section III reviews and
new benchmark datasets or DL algorithms, a brief review analyzes DL-based intrusion detection methods, considering
on industrial IDSs, such as programmable logic controller DBNs, Autoencoders (AEs), CNNs, Long Short-Term Mem-
systems, is presented. Similarly, the review published in [23] ory (LSTM) networks, and Generative Adversarial Networks
addresses only application-layer Distributed DoS (DDoS) at- (GANs). Section IV provides a discussion of and insights into
tacks, describing how they are hidden behind low traffic and the limitations and current research trends regarding public
the features used to detect DDoS attacks occurring in the datasets and IDSs. Finally, Section V concludes this work.
application layer. Furthermore, this review discusses defense Table I summarizes the acronyms and notations used in this
mechanisms for protecting against these attacks, such as user paper.
IEEE SYSTEMS JOURNAL 3

TABLE I TABLE II
L IST OF ACRONYMS AND NOTATIONS USED IN THIS PAPER P ROGRAMS USED TO CAPTURE AND PREPROCESS NETWORK TRAFFIC
Notation Description Method Step Program Ref.
ML Machine Learning libPCAP [32]
DL Deep Learning Capture winPCAP [33]
DM Data Mining SNORT [34]
IDS Intrusion Detection System Wireshark [35]
IoT Internet of Things PCAP tshark [36]
IP Internet Protocol tcpdump [37]
Preprocessing
TCP Transmission Control Protocol networkminer [38]
UDP User Datagram Protocol rapidminer [39]
GDPR General Data Protection Regulation scapy [40]
PCAP Packet CAPture Cisco NetFlow [41]
NetFlow Capture/Preprocessing
SSH Secure Shell nfdump [42]
FTP File Transfer Protocol
SQL Structured Query Language
SYN TCP packet used to request a connection
case of nonflooding attack types such as R2L and U2R
DoS Denial of Service attacks, which are performed using packet payloads.
DDoS Distributed Denial of Service
U2R User-to-Root
NetFlow enables the collection of summary information
R2L Remote-to-Local or certain predefined attributes related to the flow of
XSS Cross-Site Scripting
k-NN k-Nearest Neighbor
packets in a network. Examples of the features that can be
ANN Artificial Neural Network extracted include the number of packets in given a time
SVM Support Vector Machine
RBM Restricted Boltzmann Machine
period or the size of data transmitted over the network.
DBN Deep Belief Network Although data collection via NetFlow is more memory
AE Autoencoder
CNN Convolutional Neural Network
efficient than data collection via PCAP, only summary
RNN Recurrent Reural Network data are considered, and it is not possible to extract new
LSTM Long Short-Term Memory
GAN Generative Adversarial Networks
types of features to address new needs.
PCA Principal Component Analysis The most commonly used programs for performing PCAP
are libPCAP, winPCAP and SNORT. In addition, several
II. C YBERSECURITY DATASETS programs allow the preprocessing of PCAP files to extract
This section presents a review of cybersecurity datasets, different types of features. For example, such preprocessing
outlining the data collection steps, feature and attack types, programs include Wireshark, tshark, tcpdump, networkminer,
available benchmark databases, and reliability criteria. rapidminer and scapy. The most commonly used programs for
capturing and preprocessing NetFlow data are Cisco NetFlow
A. Data Collection and nfdump. Table II summarizes the various programs used
to capture and preprocess network traffic using the PCAP and
This section presents the methods of data collection used NetFlow methodologies.
in cybersecurity applications. Specifically, data collection can
be performed in two different ways. The first is based on B. Feature Types
processing system calls (system logs) from host-based oper- This section examines the types of features extracted from
ating systems. The second is based on packet headers and the available datasets. Although new features are added when
payloads extracted from network traffic packages and from novel attack patterns are discovered, there are several reoccur-
applications using the Transmission Control Protocol (TCP)/IP ring feature types in the literature.
communication stack [30]. First, a distinction can be drawn between host-based and
The two main methodologies used to collect network traffic network-based data based on the procedure used to collect the
in the second way are full Packet CAPture (PCAP) and the data, as described in Section II-A. In most cases, host-based
NetFlow protocol: data are composed of system/operation logs, which consist of
PCAP enables the collection of the most detailed data attributes such as system calls. Feature extraction from system
from a network because it involves the extraction of calls is generally performed using methods based on natural
whole network packets (including packet headers) for language processing, such as n-grams [43].
all information being transmitted. In particular, the data On the other hand, network-based data are obtained by
collected from such packets include the packet size, collecting network traffic data. However, network traffic is
protocol types, headers of flows, flags, source and des- composed of many individual packets/frames, and feature
tination IP addresses, and source and destination port extraction must be performed for each traffic session, known
numbers [31]. However, the information contained in as flow-level traffic data, to reduce the dimensionality of the
the payload of a packet may be deleted or anonymized data and detect intrusions. Such feature extraction is conducted
due to privacy issues. In fact, a packet payload may based on three different types of features: basic, traffic-based,
contain sensitive data such as private information, instant and content-based features.
messaging conversations, or a history of visited websites. Basic features are extracted from TCP/IP connections and
In most cases, a trade-off must be established between can be classified as header-based, flow-based, connection-
anonymizing payloads to protect user privacy and using based, or packet-based features. Header-based features are
all collected data to achieve accurate attack detection. related to the packet header and include the source and
This trade-off is especially important to consider in the destination IP addresses, the TCP and User Datagram
IEEE SYSTEMS JOURNAL 4

Protocol (UDP) source and destination ports, the IP Denial of Service (DoS) [45] attacks are based on tem-
protocol, the service, and the IP header length. Flow- porarily blocking the normal use of network utilities
based features include attributes computed through anal- by flooding the network with traffic. Examples of DoS
ysis of the flow. In particular, a flow is defined as a attacks include botnet, Slowloris, smurf, and SYN flood
set of packets having a common set of properties (flow attacks.
keys), which may include IP addresses, port numbers, Distributed DoS (DDoS) [46] attacks are based on flood-
or meta-information [44]. Examples of flow-based fea- ing the server and making it unable to respond by over-
tures are statistical aggregations (e.g., average, maximum, loading it with service requests. Unlike in DoS attacks,
minimum) on the size, time of arrival, and number of the flooding is performed via many sources. Examples of
inbound/outbound packets in a given time period, the du- DDoS attacks include local area network denial (LAND),
ration of that period, and the type of packets. Connection- ping-of-death, RUDY, and teardrop attacks.
based features are related to a particular connection, User-to-Root (U2R) [45] attacks involve behaving as a
which is defined as a stream of packets between two normal user with the aim of detecting system vulnerabil-
specific IP addresses. Such features include the interval ities and gaining root access. Examples of U2R attacks
between packets, the timestamp, and the time to live. Fi- include buffer overflow, rootkit, Perl, and loadmodule
nally, packet-based features are related to the transmitted attacks.
data and include the payload and mean number of bytes Remote-to-Local (R2L) [45] attacks attempt to use a
of a packet. The main advantage of basic features is that remote system to gain unauthorized access to and damage
they are general and can be used to detect several kinds the target system. R2L attacks may be combined with
of attacks [45], [46]. U2R attacks, making these types of attacks difficult to
Traffic-based features are associated with either a specific differentiate. Examples of R2L attacks include Secure
time interval (e.g., 2 seconds) or a specific number of Shell (SSH) brute force, warezmaster, multihop, imap,
connections (e.g., 100 connections). These features can be and spy attacks.
extracted by considering either the same host or the same Probe [45] attacks are based on searching for vulnerabili-
service. In the first case, the extracted features include ties throughout the whole network by sending scan pack-
statistical sums of connections with the same destination ets and gaining information about the system. Examples
host, whereas in the second case, the extracted features of probe attacks include Satan, IP sweep, and port sweep
comprise statistical sums of connections to the same ser- attacks.
vice for a fixed amount of time or number of connections Password [18] attacks attempt to gain unauthorized access
[45]. One drawback of traffic-based features is that some to the system by using guessing techniques to steal
attack types span time intervals longer than 2 seconds passwords. Examples of password attacks include brute
or a number of connections greater than 100. Examples force FTP-Patator and brute force SSH-Patator attacks.
of such attack types include low-frequency attack types Injection [47] attacks use scripts that inject com-
such as U2R, R2L, and low-rate DoS attacks, in which the mands/queries with the purpose of gaining unauthorized
frequency of the transmitted information is similar to that access and stealing information. Examples of injection
of legitimate traffic, in contrast to high-frequency attack attacks include SQL injection and Cross-Site Scripting
types, which exhibit a higher frequency than normal (XSS).
traffic. Although some newly proposed connection-based
features span time intervals longer than 2 seconds, these Table III lists the attack types considered in the most fre-
features are not fully adequate for identifying such attack quently used benchmark datasets, along with their definitions.
patterns [45]. Although the definitions provided in Table III can be used to
Content-based features are extracted from information distinguish the different attacks, three additional factors must
embedded in different data portions of packets and in- be considered when designing an IDS. First, an attack of one
clude the number of requests, the request type, and the type may be the beginning of another attack of a different
number of failed login attempts. Content-based features type. In this case, the characteristics of the true attack will be
are especially useful for detecting low-frequency attack a combination of the characteristics of both attacks. Second,
types, which do not exhibit sequential patterns as high- some attack characteristics may evolve over time. For instance,
frequency attacks do. In fact, while traffic-based fea- DDoS attacks are mostly understood to be high-frequency
tures can be used to detect high-frequency attacks, low- attacks that flood the bandwidth of a network; however, DDoS
frequency attacks are difficult to detect using only basic attacks in the application layer are low-frequency attacks that
and traffic-based features, and in most cases, content- flood the server instead of flooding the network. Third, some
based features are also required [45]. attack types may show similar patterns. For example, both
DoS and probe attacks, in most cases, exhibit sequential
C. Attack Types patterns and involve a large number of connections to the
This section outlines the various attack types considered same host, whereas R2L and U2R attacks are both embedded
in IDSs. In particular, we present the following attack types, in packets. Therefore, although DoS and probe attacks are
since they are the ones considered in the most frequently used easy to differentiate from R2L and U2R attacks, it may not
benchmark datasets [48]: be as easy to differentiate DoS attacks from probe attacks or
IEEE SYSTEMS JOURNAL 5

TABLE III
ATTACK TYPES REPRESENTED IN THE MOST FREQUENTLY USED CYBERSECURITY BENCHMARK DATASETS
Attack name Examples Description
Temporarily blocks the normal use of network utilities by flooding the network
Denial of Service (DoS) [45] Botnet, Slowloris, smurf, SYN flood
with traffic.
Floods the server and makes it nonresponsive to users by overloading it with
Distributed DoS (DDoS) [46] LAND, ping of death, RUDY, teardrop service requests. Unlike in DoS attacks, the flooding originates from many
sources.
Behaves as a normal user with the aim of detecting system vulnerabilities and
User-to-Root (U2R) [45] Buffer overflow, rootkit, Perl, loadmodule
gaining root access.
Gains local access via a remote system and damages the system. May be
Remote-to-Local (R2L) [45] SSH brute force, warezmaster, multihop, imap, spy
combined with U2R attacks, thus making these attacks difficult to differentiate.
Searches for vulnerabilities throughout the whole network via IP addresses by
Probe [45] Satan, IP sweep, port sweep
sending scan packets and gaining information about the system.
Password [18] Brute force FTP-Patator, brute force SSH-Patator Gains access to the system after stealing passwords by guessing.
Uses a script to inject commands/queries to gain unauthorized access and steal
Injection [47] SQL injection, Cross-Site Scripting (XSS)
information.

U2R attacks from R2L attacks due to their similar embedding features and uncategorized samples. The dataset consists of
patterns. 155 features extracted using Wireshark [49].
To increase the effectiveness of differentiating among attack 2) CICIDS2017: This dataset was created from realistic
types, several studies have investigated which types of features traffic data at the Canadian Institute for Cybersecurity of the
are effective for detecting particular attack types. For example, University of New Brunswick (UNB) in 2017 and includes a
the authors of [18] report that on the basis of the features full-packet dataset with 152 features and raw PCAP files [50].
contained in the KDDCUP99 dataset, even though DoS attacks The dataset considers attacks and subattacks such as injection
can be differentiated using basic and traffic-based features, attacks based on SQL injection and XSS, password attacks
considering some sparse features, such as flags, destination IP based on brute force FTP-Patator and brute force SSH-Patator,
addresses, percentages of connections to the same service, and and flooding attacks based on DoS, Goldeneye DDoS, HULK
percentages of connections to the same port, can result in more DDoS, slow HTTP DDoS, Slowloris DDoS, and Heartbleed.
effective detection. Similarly, duration, service, destination Although the criteria for a reliable dataset proposed by [54]
host same service rate, and flag features are vital for detecting are satisfied, one feature among the attributes is duplicated.
probe (scanning) attacks. The most important features for 3) KDD99/KDDCup99: Also known as KDDCup99, the
detecting U2R attacks are the number of failed logins, number KDD99 dataset was created using DARPA 1998 PCAP files
of shells, number of roots, duration, and service. For R2L and includes full-packet data, divided into subsets for training
attacks, the most important features are the duration, service, and testing [51].
service bytes, destination bytes, number of failed logins, count, This dataset considers DoS-based subattacks such as back,
destination host count, and destination host service count. As LAND, ping of death, teardrop, Neptune, and smurf attacks;
seen above, the features used to detect attacks of the probe, U2R subattacks such as buffer overflow, loadmodule, Perl,
U2R, and R2L types show a high degree of similarity, which and rootkit attacks; R2L subattacks such as ftp-write, guess-
explains why these three attack types are often misclassified password, imap, multihop, PHF, spy, warezclient, and warez-
among each other. master attacks; and probe-based subattacks such as port sweep,
IP sweep, NMAP, and Satan attacks. As of 2019, this dataset
D. Benchmark Datasets remains the most widely used benchmark dataset in the field of
This section introduces and analyzes benchmark datasets network intrusion detection. However, this dataset suffers from
for intrusion detection, considering both the extent to which several limitations, including duplicated samples, different
they reflect novel attack types due to the evolving nature of probability distributions between the training and test data,
intrusion patterns over time and their shortcomings. For the unbalanced classes, and a lack of coverage of the newest attack
benchmark datasets considered in this section, Table IV lists types.
the most frequently used datasets, while Table V summarizes 4) NSL-KDD: This dataset was created by erasing all
the distribution of the samples in each dataset across the duplicate records from the KDD99 dataset and using sampling
different attack types considered. techniques to balance the number of data samples in each class
1) AWID2018: Also known as CSE-CIC-IDS2018, this [45]. This dataset includes separate databases for training and
dataset includes databases for training and testing collected testing, where the test database consists of fourteen subattack
using two different capture procedures. The data collected types that are not present in the training database. NSL-KDD
using the first procedure consist of full-packet network traffic is not subject to most of the limitations of the KDD99 dataset;
with system logs, while the data collected using the second however, this dataset still lacks newer attack types.
procedure consist of reduced packet traffic. The dataset in- 5) Kyoto: This dataset was created from honeypots at
cludes two different labels for attacks: a main attack label and Kyoto University and consists of traffic data collected daily
a subattack label. This dataset has the advantages of including between 2006 and 2015 [52]. The dataset includes 24 features,
the newest attack types, such as password attacks based on fourteen of which are in common with the KDD99 dataset, and
the SSH/FTP brute force approach, injection attacks based on labels indicating normal data, known attacks, and unknown
SQL injection, and flooding attacks based on DoS. However, attacks. The dataset is missing data from some days and
the data exhibit some limitations, such as noisy, misleading months during the time of its collection, and the average
IEEE SYSTEMS JOURNAL 6

TABLE IV
OVERVIEW OF THE MOST FREQUENTLY USED CYBERSECURITY BENCHMARK DATASETS
Ref. Name Year Num. of features Num. of samples Attack types Separate train-test sets
210900113 (full)
[49] AWID2018 2018 155 Flooding, impersonation, injection Yes
2326218 (reduced)
DoS/DDoS, port scan, FTP-Patator, SSH-
[50] CICIDS2017 2017 152 2830743 Patator, bot, web attacks, inﬁltration, Heart- No
bleed
4900000 (full)
[51] KDD99 1999 42 494021 (subset) DoS, probe, U2R, R2L Yes
311029 (testing)
125973 (training)
22544 (testing)
[45] NSL-KDD 2009 42 DoS, probe, U2R, R2L Yes
25192 (training)
11850 (testing)
[52] Kyoto 2006-2015 24 Various Known, unknown No
2540047 (full)
Fuzzers, worms, shellcode, analysis, back-
[53] UNSW-NB15 2015 49 175341 (training) Yes
doors, DoS, exploits, generic, reconnaissance
82332 (testing)
Notes: * = including 1 feature as a label; ** = including 2 features as labels; *** = at the time of this survey.

TABLE V
D ISTRIBUTIONS OF ATTACK TYPES IN THE MOST FREQUENTLY USED BENCHMARK DATASETS
Name AWID2018
Attack Normal Flooding Impersonation Injection
N. samples 205074514 1409392 2361892 2054315
(Perc.) (97.24%) (0.67%) (1.12%) (0.97%)
Name CICIDS2017
Attack Benign DoS DDoS Port scan FTP-P. SSH-P. Bot Web att. Infiltr. Heartb.
N. samples 2273097 252661 128027 158930 7938 5897 1966 2180 36 11
(Perc.) (80.3004%) (8.9257%) (4.5228%) (5.6144%) (0.2804%) (0.2083%) (0.0695%) (0.077%) (0.0012%) (0.0003%)
Name KDD99
Attack Normal DoS Probe U2R R2L
N. samples 972781 3683370 41102 52 1126
(Perc.) (20.71%) (78.4%) (0.8897%) (0.0001%) (0.0002%)
Name NSL-KDD
Attack Normal DoS Probe U2R R2L
N. samples 77054 53385 14077 252 3649
(Perc.) (51.9%) (35.9%) (9.5%) (0.2%) (2.5%)
Name Kyoto
Attack Normal Known attacks Unknown attacks
N. samples 1186780 11218206 563
(Perc.) (9.5706%) (90.429%) (0.0004%)
Name UNSW-NB15
Attack Fuzzers Worms Shellcode Analysis Backdoors DoS Exploits Generic Rec.
N. samples 24246 174 1511 2677 2329 16353 44525 215481 13987
(Perc.) (7.572%) (0.054%) (0.47%) (0.83%) (0.72%) (5.112%) (13.8%) (67.092%) (4.35%)
Notes: N. samples = Number of samples; Perc. = Percentage; FTP-P. = FTP-Patator; SSH-P. = SSH-Patator; Web att. = Web attacks; Infiltr. = Infiltration; Heartb. = Heartbleed;
Rec. = Reconnaissance. The largest remainder method was used when computing the percentages to ensure a total of .

number of samples per month is approximately twelve million. used intrusion detection datasets; however, it is commonly
Since the traffic was captured from honeypots, which are considered to be outdated and to contain irregularities [56].
designed to protect against less advanced attackers, most of the 8) ISCX IDS 2012: Also known as UNB or UNB ISCX
monitored attacks did not originate from advanced attackers. 2012, this dataset was created at UNB in 2012 and includes
Therefore, the dataset may not be representative of realistic full-packet network data [57]. The dataset includes normal
attacks. traffic data and attack data for attack types such as infil-
6) UNSW-NB15: This dataset was synthetically created at tration, DoS, DDoS, and brute force SSH attacks. Although
the Cyber Range Lab of the Australian Centre for Cybersecu- this dataset includes some of the newest attack types, it is
rity and includes full, training, and test datasets as well as criticized as being unrealistic for not containing sufficient
raw PCAP files. The dataset includes 49 features and two internet background noise, as it consists of pure network traffic
label attributes: the first label describes the attack, and the rather than data received by any real device [58].
second label is binary. The dataset considers attacks such as 9) CIC DoS: This dataset was created at the Canadian
fuzzers, backdoors, shellcode, DoS attacks, worms, generic Institute for Cybersecurity of UNB in 2017 [59]. It considers
attacks, reconnaissance attacks, exploits, and analysis attacks the application layer and incorporates data that describe high-
[53]. One of the limitations of this dataset is the existence of volume (traditional) DoS attacks, data corresponding to low-
several missing samples. volume DoS attacks, and normal data from the ISCX IDS 2012
7) DARPA: This dataset was created at the MIT Lincoln dataset.
Laboratory in 1998 and includes full, training, and test sets 10) Gure-KddCup: This dataset was created using the
of raw PCAP files [55]. The newer versions of the DARPA PCAP data from the DARPA 1998 dataset [60]. It includes
dataset, DARPA 1999 and DARPA 2000, are based on the features similar to those of the KDD99 dataset, with the
1998 version. This dataset is one of the most commonly addition of payload information and other new features, such
IEEE SYSTEMS JOURNAL 7

as IP addresses and port numbers, to make U2R and R2L as network telescope and DDoS databases [58], [70]. Although
attacks more visible/distinguishable [61]. there are a few up-to-date databases, such as CAIDA DDoS,
11) CDX: The Cyber Defence Exercises (CDX) dataset most do not accurately represent the different possible types
[62] was collected from the United States Military Academy of attacks. For instance, the DoS attack databases consist only
network in 2009 and consists of PCAP data extracted from of spoofed-source DoS attacks and exclude other versions of
system logs, divided into intrusion traffic and normal traffic DoS attacks.
[56]. 20) DEFCON: The DEFCON datasets are created for in-
12) ASNM-CDX: This dataset was created from the CDX trusion modeling competitions held every year. Although these
network traffic data in 2009. The dataset includes 5772 datasets are continuously created, they focus only on intrusions
samples, each with 875+1+1 features. It includes distributed and attacks and lack normal background traffic [58]. Therefore,
features often used in detecting low-frequency attacks, such they are not frequently used for network intrusion detection.
as the number of packets and the total bytes in/out from four 21) Others: In addition to the most commonly used bench-
seconds to fifty-four seconds. In some cases, the features have mark datasets, a variety of publicly available raw traffic
been converted with the fast Fourier transform to increase datasets exist. These datasets include Metrosec, UNIBS 2009,
their discriminative ability. This dataset has two attack label TUIDS, the University of Napoli traffic dataset, payload
attributes: the first label discriminates between legitimate and datasets such as the CSIC 2010 HTTP Dataset, the UNM
malicious traffic, and the second label indicates whether the system call dataset, and an enormous variety of network traffic
attack is based on buffer overflow. However, this dataset lacks from the Capture the Flag Competitions (CTF) and CDX.
traffic diversity since it consists only of buffer overflow attacks Moreover, several host-based datasets also exist, including
[63]. the ADFA Linux Dataset (ADFA-LD), the ADFA Windows
13) LBNL: This dataset was created at the Lawrence Berke- Dataset (ADFA-WD) and the ADFA Windows Dataset Stealth
ley National Laboratory (LBNL) between 2004 and 2005. Attacks Addendum (ADFA-WD:SAA) [71].
Although the dataset includes packet headers, the payloads
are anonymized due to privacy issues, which limits its infor- III. DL- BASED I NTRUSION D ETECTION M ETHODS
mativeness [64]. Traditional ML-based methods for cybersecurity include
14) ISOT: This dataset was created in 2010 by combining approaches based on the k-Nearest Neighbor (k-NN) algo-
Storm, Waledac, and Zeus botnet attack data from the French rithm, k-means clustering, Artificial Neural Networks (ANNs),
Chapter of the Honeynet Project and normal traffic data from fuzzy logic, Bayesian networks, hidden Markov models, self-
the Traffic Lab at Ericsson Research and LBNL [65]. organizing maps, decision trees, evolutionary classifiers, Sup-
15) MAWI: This dataset was collected by the MAWI Work- port Vector Machines (SVMs), and rule-based systems [17],
ing Group in Japan and includes continuously updated traffic [18], [22], [26]. In this survey, we focus on the more recent
data from 2001 to 2019. A graph-based methodology has been DL-based approaches, which have not been covered in detail
used to label the raw data as either abnormal or normal [66]. in previous surveys.
One of the limitations of this dataset is duplicated packets. To provide up-to-date descriptions of the recent methods
16) CTU-13: This dataset is a combination of botnet traffic developed for cybersecurity, this section describes DL-based
data, normal data, and background data collected at Czech methods for intrusion detection. For each algorithm, we
Technical University in Prague (CTU) in 2011. Although the consider evaluation criteria such as a fast run/convergence
data consist of a variety of botnet scenarios and extended time, a high detection ability with a low false positive rate,
truncated versions of PCAP files with complete TCP, UDP and adaptability to novel intrusions, computational efficiency, and
Internet Control Message Protocol (ICMP) headers, the dataset scalability [16]. In the remainder of this section, we consider
is specifically designed only for botnet detection. Therefore, DL methods based on DBNs, AEs, CNNs, LSTM networks,
it is considered unrealistic to mix these data with normal and and GANs [15]. A summary of the presented DL methods in
background traffic [67]. the IDS context is presented in Table VI.
17) UMass: This dataset was collected between 2004 and
2018 and contains traffic data such as Tor traffic data, Gateway A. Deep Belief Networks (DBNs)
Link 3 Trace data, web requests, and response data. However, DBNs are a type of ANN obtained by stacking together
most of the data were collected under similar network traffic several Restricted Boltzmann Machines (RBMs [77]), which
conditions and lack a broad variety of attacks [68]. act as the layers of the DBN, and introducing connections
18) Twente: This dataset was created from honeypots at between the layers but not within each layer. The RBMs used
the University of Twente in 2009 and consists of more than to construct a DBN consist of two main layers, one visible
fourteen million flows and more than seven million alerts. In and one hidden, constituted by a variable number of neurons.
this dataset, some samples are left unlabeled, and informative Additionally, within each RBM, the neurons of different layers
data from the packet headers and payloads are anonymized are fully connected, whereas the connections within the same
[69]. This dataset has the limitation that traffic originating from layer are restricted [72]. Fig. 1 shows an example of a DBN.
honeypots does not represent realistic attacks since honeypots Because of their layered structure, DBNs have the advantage
are designed to protect against less advanced attackers. that fast learning procedures can be used, which can be applied
19) CAIDA: The CAIDA dataset consists of a variety of in a greedy fashion, layer by layer, in an unsupervised way
different databases that are specific to particular events, such [78]. As a consequence of this advantage, methods based
IEEE SYSTEMS JOURNAL 8

TABLE VI
S UMMARY OF DL- BASED METHODS FOR INTRUSION DETECTION
Method Description Pros Cons
Deep Belief Stacks of Restricted Boltzmann Machines (RBMs) with Fast and unsupervised layer-by-layer learning in
Training uses an approximation of
Networks (DBNs) connections between the layers but not within each a greedy fashion. Unsupervised dimensionality
the gradient.
[72] layer. reduction.
Can be trained in an end-to-end manner using
Autoencoders Encoder-decoder structure that maps input data to a Requires an additional ML model
learning algorithms based on gradient descent.
(AEs) [73] hidden space and then reconstructs them. to perform classiﬁcation.
Unsupervised dimensionality reduction.
Convolutional Performs classiﬁcation while automatically learn- Computationally expensive to train.
Sequences of convolutional layers trained via gradient
Neural Networks ing data representations. Learns discriminant spa- Not naturally suited to processing
descent.
(CNNs) [74] tial patterns invariant to translation and shifting. data in time-series form.
Long Short-Term The research community is increas-
Neurons arranged in a temporal sequence, able to
Memory Can natively process time-series data. ingly focusing on CNNs rather than
maintain memory for arbitrary intervals of time.
(LSTM) [75] LSTM networks.
Generative Adv. Combination of a generator, which generates data start-
Learns data distributions in an unsupervised man- Often requires visual inspection of
Networks (GANs) ing from a random distribution, and a discriminator,
ner. the results.
[76] which distinguishes real data from synthetic data.

Fig. 1. Example of a Deep Belief Network (DBN). DBNs are obtained by

stacking together several Restricted Boltzmann Machines (RBMs), which act
as the layers of the DBN. In DBNs and RBMs, neurons of different layers
are fully connected, while connections within the same layer are restricted. Fig. 2. Example of an Autoencoder (AE). The same data are used as both
input and target: the encoder maps the input data to a hidden space in a
nonlinear manner, and the decoder reconstructs the input data by mapping
on DBNs were among the first DL-based approaches studied the encoded data back to the original input space.
for intrusion detection. In addition, the ability to train DBNs
B. Autoencoders (AEs)
using fast and unsupervised learning algorithms makes them
particularly suitable for performing a preliminary dimension- AEs are a type of ANN used to learn and reconstruct a
ality reduction step, with the aim of extracting a compact and representation of input data. In AEs, the same data are used as
discriminant representation of the data, without the need for both input and target, with the purpose of learning a model that
labels, even in the case of large intrusion detection databases. can extract a compact and discriminant representation of the
For example, in the method described in [79], proposed in input data in an unsupervised manner. Such a representation
2011, a DBN was first applied to perform a feature reduction can then be used as input to a classifier to perform detection.
step, and an SVM was then used to classify the intrusions An AE consists of two components: an encoder and a decoder.
contained in the NSL-KDD dataset. Similarly, the approach The encoder maps the input data to a hidden space in a
proposed in [80] is based on a DBN, the parameters of nonlinear manner, and the decoder reconstructs the input data
which are first optimized via particle swarm optimization to by mapping the encoded data back to the original input space.
map the input data to a space of reduced dimensionality. The purpose of the decoder is to minimize the reconstruction
Then, a probabilistic neural network is trained to perform error, defined as the difference between the input data and the
classification. reconstructed data [73]. Fig. 2 shows an example of an AE.
More recent methods have introduced ML architectures Because of their ability to extract a compact and highly dis-
based on deeper DBNs, such as the approaches described in criminative representation with reduced dimensionality from
[81]–[83], which have achieved improved detection accuracy input data, AEs are often used as a preprocessing step in
on the NSL-KDD and KDD99 datasets. intrusion detection. In most cases, the related approaches
presented in the literature involve using AEs to preprocess the
DBNs have the drawback that it is computationally unfeasi- input data, followed by the application of an ML classifier. In
ble to train them end to end in a supervised way using gradient contrast to DBNs, AEs can be trained in an end-to-end manner
descent methods. Due to this drawback, in most cases, DBNs using learning algorithms based on gradient descent, without
are trained using training algorithms based on contrastive the need to approximate the gradient [15]. For example, in the
divergence, which rely on an approximation of the gradient method described in [86], an AE with seven layers is used to
[84]. Recently, however, the growing availability of computing obtain a compact and discriminant representation of the input
power and GPU-based training architectures (e.g., CUDA [85]) data. On the NSL-KDD dataset, this method achieves superior
has made it possible to train DL models using end-to-end detection accuracy compared with other dimensionality reduc-
learning algorithms based on gradient descent, without the tion methods based on principal component analysis (PCA)
need to approximate the gradient [15]. Examples of recent and kernel PCA. Its main limitation is the lack of information
DL models trained end to end include AEs, CNNs, LSTMs, about the shallow classifiers used to classify the data after the
and GANs.
IEEE SYSTEMS JOURNAL 9

dimensionality reduction phase. Similar to that in [86], the

method proposed in [87] involves applying an AE to the input
data to extract features, which are then classified using shallow
classifiers such as naive Bayes, k-NN, and SVM classifiers.
This work considers the NSL-KDD dataset and reports higher
accuracy than that achieved by previous methods, particularly
when using the naive Bayes classifier. Fig. 3. Example of a convolutional neural network (CNN). The layers perform
Several methods in the literature combine an AE with a subsequent processing by convolving the data with banks of filters.
density estimation model to achieve greater detection accuracy. the method described in [98] combines VAEs with a gradient-
For example, the method proposed in [88] adopts a combined based fingerprinting detection model. In this method, gradient-
approach based on an AE and density estimation. This method based fingerprints are first extracted from NetFlow data taken
achieves a high detection accuracy on the NSL-KDD dataset, from the UGR16 dataset [99], and VAEs are then applied
especially for DoS and probe attacks. The method described for feature reduction. The work reported in [100] presents
in [89] extends the previous method by combining an AE with a more comprehensive evaluation conducted by combining
a Gaussian mixture model to perform intrusion detection. The VAEs with several classifiers, such as naive Bayes, SVM,
model consists of an estimation network, which evaluates the decision tree, and random forest classifiers. This method
densities of the samples in a low-dimensional space, and a achieves good results on the NSL-KDD and UNSW-NB15
compression network, which projects the data into a lower- datasets, especially when using the decision tree and random
dimensional space. A procedure based on joint parameter forest classifiers.
optimization is used to update the model parameters. On the Despite the ability of AE-based methods to automatically
KDD99 dataset, this method achieves a significant accuracy obtain a compact and discriminant feature representation that
improvement compared to baseline methods using pretrained can be adapted to the input data, AEs have the drawback of re-
AEs. quiring an additional ML model to perform classification based
A variant of AEs is represented by sparse AEs, which on the obtained feature representation. To compensate for this
use a sparsity constraint to further reduce the dimensionality drawback, the use of CNNs is increasingly being considered in
of the obtained representation [73]. Specifically, the method recent DL-based methods developed for IDSs. This is because
described in [90] uses a sparse AE combined with a soft- CNNs can be trained to process input data to automatically
max regression classifier to perform intrusion detection. The learn a compact and discriminant representation [15] while
method achieves higher accuracy than previous models on simultaneously classifying the obtained representation into the
the NSL-KDD dataset but considers only binary classification, corresponding attack types [21].
differentiating between normal and anomalous traffic.
As the available computational power has increased, recent C. Convolutional Neural Networks (CNNs)
methods based on AEs have also considered Stacked AEs CNNs are a type of ANN in which the layers are structured
(SAEs) [91], which consist of several AEs trained separately to process input data in the form of multidimensional signals
and then “stacked” to obtain a deeper model and a more such as images or three-dimensional volumes. The different
discriminant representation. The method proposed in [92] uses layers in a CNN perform subsequent processing by convolving
a model based on an SAE to preprocess raw traffic data the data with banks of filters, with parameters typically learned
in the CTU-13 dataset. Similarly, the method proposed in via gradient descent. CNNs have the main advantage of
[93] uses an SAE to process traffic data captured from home performing classification while automatically learning data
wireless networks, representing several types of DDoS attacks. representations, without the need for a handcrafted feature
To improve the detection accuracy, the method introduced in extraction step [74], [101]. Due to this advantage, CNNs have
[94] combines an SAE with a random forest classifier. The been successfully applied in several scenarios [15], [102]–
method has been tested on the KDD99 and NSL-KDD datasets [105]. Fig. 3 shows an example of a CNN.
by reducing the feature dimensionality of the input data and In addition to their advantage of automatically learning
performing five-class classification. The method achieves high data representations, CNNs have been especially successful in
overall detection accuracy but exhibits low accuracy in detect- processing data with the aim of learning discriminant spatial
ing U2R and R2L attacks. Similarly, the method described in patterns among the features that are invariant to translation and
[95] achieves high detection accuracy on the KDD99 dataset shifting [101]. In the context of IDSs, the advantages of CNNs
based on a combination of four AEs. Instead of using a random have been useful for classifying attack types by considering
forest classifier, the method presented in [96] combines SAEs the relationships among the features while requiring minimal
with a radial basis function classifier to achieve high detection preprocessing of the data [21].
accuracy on the AWID2018 dataset. Recently, however, the vast majority of CNN architectures
Recently, one of the most commonly used types of AE have been structured to process data in the form of images
models has been Variational AEs (VAEs) [97]. Their main [15]. Therefore, to use a CNN for intrusion detection, a prepro-
novelty is that, whereas AEs use a deterministic discriminative cessing step must be performed to transform the features into
model, VAEs use a probabilistic generative model to recon- a two-dimensional format that can be processed by the CNN.
struct the input data. As a consequence, VAEs are less prone Several methods have been proposed for transforming features
to overfitting than AEs are. In the field of intrusion detection, into a two-dimensional format. For example, the preprocessing
IEEE SYSTEMS JOURNAL 10

method proposed in [106] converts feature attributes into

binary vectors. The method converts symbolic attributes, such
as flag, service, and protocol type attributes, into binary vectors
using one-hot encoding [107]. Then, continuous attributes are
converted by performing min-max normalization, discretizing
the normalized values into ten intervals, and applying one-
hot encoding. Finally, the obtained vectors are combined and
reshaped to form a two-dimensional image. Similarly, the
preprocessing approach presented in [108] converts malware Fig. 4. Example of a Long Short-Term Memory (LSTM) network. The
neurons are connected following a temporal sequence. The forget, input, and
binaries into grayscale images. A malware file is first read as output gates control which information is preserved in the network and passed
a vector of 8-bit binary numbers, and each binary number is to the next time step.
then converted into its equivalent decimal value. Finally, the primarily useful for learning discriminant spatial patterns from
resulting decimal vector is reshaped into a two-dimensional input data, whereas they are not naturally suited for processing
grayscale image. data in the form of time series with the intent of learning
Recent preprocessing methods for CNN-based intrusion discriminant temporal patterns. To overcome this disadvantage,
detection have extended grayscale image representations to several recent methods have considered the use of LSTM net-
consider multiple channels. For example, the new encoding works, which are specifically structured for learning temporal
method proposed in [109] is designed to give equal weight to patterns by processing new data while maintaining a memory
each feature, producing a feature representation with 24 bits of previous samples [75].
for each pixel, similar to an RGB color image.
To accelerate the transformation process, some methods D. Long Short-Term Memory (LSTM) Networks
involve performing feature selection before converting the An LSTM network is a type of ANN based on an RNN
data into an image-based format. For example, in the method in which the neurons are connected following a temporal
described in [110], feature selection is applied via a genetic sequence. However, in contrast to traditional RNNs, LSTM
algorithm. networks have a deeper structure of hidden neurons with the
After the features are transformed into an image-based ability to maintain a memory of previous inputs for arbitrary
format, a CNN-based approach is applied to classify the intervals of time [75]. Due to this node arrangement, RNNs
obtained images and perform intrusion detection. Several such and LSTM networks are often used to process data in the form
approaches have been proposed in the literature. Most of of time series [123]. Fig. 4 shows an example of an LSTM
these CNN-based methods rely on a layer structure based network.
on an existing architecture. In particular, several methods use The ability of LSTM networks to process time-series data
architectures based on LeNet [74], ResNet [111], GoogLeNet has proven useful in the IDS context since datasets for cyberse-
[112], or VGG-16 [113]. The LeNet architecture is used in curity and intrusion detection are often structured as sequences
the methods described in [114]–[118], which achieve high of features evolving over time. Due to this advantage, several
detection accuracy, especially on the AWID2018 dataset [119]. intrusion detection methods in the literature are based on
Similarly, the ResNet and GoogLeNet architectures are used LSTM networks [124]. Among these methods, the approach
in the method proposed in [106], which has been tested on the proposed in [125] applies a three-layer LSTM network. It
NSL-KDD dataset after preprocessing. Although satisfactory achieves high detection accuracies on the KDD99, ADFA-LD,
results are achieved on this dataset, detection rates are not and UNM datasets. Similarly, the method proposed in [126]
reported for each class. The VGG-16 architecture is used in uses a cascade of three LSTM network modules, combined
the method presented in [108], which has been applied to using a voting mechanism, to achieve an increased intrusion
the Malign Dataset and the Microsoft Malware Dataset. This detection accuracy.
method achieves almost the same accuracy as the winner of To exploit both the accuracy of LSTM networks in pro-
the Microsoft Malware Dataset Challenge [120]. cessing time series and the capability of CNNs to extract
In addition to methods based on existing networks, there are spatial patterns from images, recent methods have increasingly
some CNN-based methods for which innovative ML architec- considered combinations of LSTM and CNN architectures
tures have been proposed. For example, the authors of [121] for intrusion detection. For example, the method proposed
propose a novel CNN and present its application to the KDD99 in [127] uses an LSTM network combined with a CNN
dataset. The method achieves a high classification accuracy to perform multiclass detection of anomalies in the KDD99
for five types of DoS attacks, exhibiting performance superior dataset. Similarly, the approach described in [128] uses both a
to that of naive Bayes and k-NN classifiers. An innovative CNN and a hybrid LSTM-CNN model to perform detection. In
architecture is also proposed in [122] based on a convolutional some cases, hybrid LSTM-CNN models have been developed
AE; however, this method has been tested only on a private based on existing architectures, such as the method developed
dataset. in [129], which relies on a CNN designed based on the LeNet
Methods based on CNNs have recently achieved high accu- model. This method has been tested on recent databases,
racy on several intrusion detection datasets due to their ability including CICIDS2017 and CTU-13.
to simultaneously learn a compact representation of the input Despite the ability of LSTM networks to natively process
data and perform adaptive classification. However, CNNs are time-series data, the introduction of novel and advanced CNN
IEEE SYSTEMS JOURNAL 11

to perform compared with cases in which a GAN is used to

generate images of known objects or people.
IV. D ISCUSSION
This section presents a discussion of the limitations, chal-
lenges, and research trends of the current databases and in-
trusion detection approaches for cybersecurity applications. In
Fig. 5. Example of a Generative Adversarial Network (GAN), composed of particular, we will focus on the issue of dataset reliability and
a generator, which generates data starting from a random distribution, and a on research directions regarding novel features for intrusion
discriminator, which distinguishes real data from synthetic data. The generator detection.
and discriminator are trained in an alternating fashion and, in most recent
architectures, have CNN-based structures. A. Dataset Reliability
architectures is shifting the attention of the research commu- The recent rise in the number of ML-based approaches,
nity towards the use of CNNs in a wider range of application particularly those based on DL, has resulted in an increase in
scenarios, including the learning of temporal patterns [130]. the accuracy of intrusion detection that can be achieved using
In fact, current research trends are increasingly focusing on state-of-the-art methodologies. However, the performance of
CNN architectures that are deeper (e.g., ResNet) [111] or DL-based methods strongly depends on the quantity and
lighter in weight (e.g., MobileNet [131]) and on computing quality of the data available [15], with the consequence that the
platforms specifically designed to accelerate the training of biases and limitations of the datasets used to train the models
such architectures [85]. Consequently, CNN-based methods directly affect the reliability of the predictions.
tend to outperform models based on recurrent architectures, Currently, although the majority of works focus on research-
such as LSTM-based models, in most cases [132]. ing algorithms that can yield improved detection results, a
few studies have been dedicated to evaluating the reliability of
E. Generative Adversarial Networks (GANs) benchmark datasets. As a first step towards evaluating dataset
reliability, the work proposed in [137] discusses the criteria
GANs are DL models that can learn and mimic the dis- for a reliable benchmark dataset, which concern the diversity
tribution of input data to generate synthetic samples with a of the traffic data, the diversity of the protocols, the volume
strong resemblance to the original data. Specifically, a GAN of collected data, the diversity of the attacks considered, the
is structured as a combination of a generator, which generates inclusion of novel attack types, the inclusion of full payloads
data starting from a random distribution, and a discriminator, without anonymization, the presence or absence of informative
which distinguishes real data from synthetic data. The genera- features, the updatability, the consideration of realistic traffic,
tor and discriminator are trained in an alternating fashion until the extent of labeling, and the size of the feature set. Finally,
equilibrium is reached [76], [133]. In most recent applications, any discussion of dataset reliability should consider the ability
the generator and discriminator of the GAN are structured as of a dataset to adapt to changes over time, for example,
CNNs, with the consequence that recent GANs can generate by mimicking statistically normal traffic in accordance with
synthetic image samples with a high degree of realism [134]. upcoming needs.
Fig. 5 shows an example of a GAN. Similarly, the work described in [54] proposes eleven cri-
GANs have the main advantage of being able to learn the teria for assessing the reliability of a dataset for intrusion
distribution of the input data in an unsupervised manner, that detection: i) attack diversity, ii) anonymity, iii) available
is, without requiring class labels. In the IDS context, this protocols, iv) complete capture (with payloads), v) complete
characteristic is useful for learning the characteristics of data interaction, vi) complete network configuration, vii) complete
distributions in specific situations (e.g., under normal condi- traffic, viii) feature set, ix) heterogeneity (all network traffic
tions). Due to this advantage, many recent methods developed and system logs), x) correct labeling, and xi) metadata (full
for IDSs use GANs trained on existing datasets to detect documentation of data collection). In addition to complying
anomalies, where the training data include only data captured with these criteria, a reliable dataset should also provide a
in specific situations. Among these methods, a CNN-based means of anonymizing the payload information to guarantee
GAN is introduced in [135] to learn the characteristics of data users’ privacy [9].
captured under normal conditions. Then, the method is used to Among the criteria considered in [54], [137], attack and
detect anomalies by computing the distance between freshly traffic diversity play a major role, since a limited diversity or
captured data and normal data. To achieve faster detection, an a high imbalance among attack types might increase the bias
algorithm is proposed in [136] that improves the computational of detection approaches towards specific situations. To limit
efficiency of the GAN described in [135] while achieving a the problem of model bias and enable accurate evaluation of
similar accuracy on the KDD99 dataset. detection algorithms, it is therefore crucial to consider datasets
Despite the ability of GANs to simulate input data distribu- that are as free as possible from internal biases while also
tions, the synthetic data generated from the learned distribution being sufficiently representative of real-world data. However,
may be insufficiently realistic compared with the real data, dataset bias has been considered mainly for the benchmark
and thus, manual (e.g., visual) inspection may be required datasets used in the field of computer vision [138], [139],
to achieve good results. In the case of IDSs, such visual whereas there is no analysis in the literature of the bias
examinations of the feature vectors could be relatively difficult of benchmark datasets for intrusion detection. Therefore, an
IEEE SYSTEMS JOURNAL 12

evaluation of dataset bias for IDSs may contribute to a fairer [3] “Hacked consumers don’t forgive companies who lose their
assessment of the various algorithms that have been proposed data. bad news for yahoo,” https://secludit.com/en/blog/
consumer-hacking-confidence.
in the field of cybersecurity. [4] McAfee, “Mcafee labs threats report,” https://www.mcafee.com/
In addition to dataset bias, a few works in the literature enterprise/en-us/assets/reports/rp-quarterly-threats-aug-2019.pdf,
address other issues related to public benchmark datasets, such 2019.
[5] R. Bhadoria, “Security architecture for cloud computing,” in Cyber Se-
as repeated data, missing values, incorrect labeling [137], or an curity and Threats: Concepts, Methodologies, Tools, and Applications,
optimistic number of false alarms due to considering specific 2018, pp. 729–755.
situations in a nonrealistic way [140]. [6] M. Swarnkar and R. Bhadoria, “Security aspects in utility computing,”
in Emerging Research Surrounding Power Consumption and Perfor-
B. Novel Features mance Issues in Utility Computing, 2016, pp. 262–275.
[7] S. Dorbala and R. Bhadoria, “Analysis for security attacks in cyber-
As the number of methodologies that are able to achieve physical systems,” in Cyber-Physical Systems: A Computational Per-
high accuracy on known datasets increases, attack patterns spective, 2015, pp. 395–414.
[8] S. K. Khaitan and J. D. McCalley, “Design techniques and applications
tend to evolve to better cheat the existing IDSs. This evolution, of cyberphysical systems: A survey,” IEEE Syst. J., vol. 9, no. 2, pp.
which can arise in nonstationary environments, is known as 350–365, 2015.
concept shift and occurs as the definitions of attacks change [9] R. Sandhu and P. Samarati, “Authentication, access control and intru-
sion detection,” in CRC Handbook of Computer Science and Engineer-
over time [141]. ing. CRC Press Inc., 1997, pp. 1929–1948.
For instance, the work presented in [142] shows that some [10] S. Han, M. Xie, H. Chen, and Y. Ling, “Intrusion detection in cyber-
low-frequency DDoS attacks that appear in newer datasets physical systems: Techniques and challenges,” IEEE Syst. J., vol. 8,
no. 4, pp. 1052–1062, 2014.
exhibit a higher degree of similarity to normal data traffic [11] T. T. T. Nguyen and G. Armitage, “A survey of techniques for internet
than do similar attacks in older datasets. As a consequence, in traffic classification using machine learning,” IEEE Commun. Surveys
recent cases, some features are less effective in detecting such Tuts., vol. 10, no. 4, pp. 56–76, 2008.
[12] R. Donida Labati, A. Genovese, V. Piuri, F. Scotti, and S. Vishwakarma,
attacks than they are in detecting older attack patterns. “Computational intelligence in cloud computing,” in Recent Advances
Therefore, it remains an open research issue to investigate in Intelligent Engineering: Volume Dedicated to Imre J. Rudas’ Sev-
whether the available features in known benchmark datasets entieth Birthday. Springer, 2020, pp. 111–127.
[13] Y. Cai, A. Genovese, V. Piuri, F. Scotti, and M. Siegel, “IoT-based
are sufficient to achieve high detection rates even in the architectures for sensing and local data processing in ambient intelli-
presence of changing attack patterns or whether it will be gence: Research and industrial trends,” in Proc. of I2MTC, 2019.
necessary to add new features to maintain a high level of [14] Z. M. Fadlullah, F. Tang, B. Mao, N. Kato, O. Akashi, T. Inoue, and
K. Mizutani, “State-of-the-art Deep Learning: Evolving machine intel-
detection accuracy. ligence toward tomorrow’s intelligent network traffic control systems,”
V. C ONCLUSION IEEE Commun. Surveys Tuts., vol. 19, no. 4, pp. 2432–2455, 2017.
[15] S. Pouyanfar, S. Sadiq, Y. Yan, H. Tian, Y. Tao, M. P. Reyes, M.-
In this review, we have analyzed Machine Learning (ML)- L. Shyu, S.-C. Chen, and S. S. Iyengar, “A survey on deep learning:
based approaches to cybersecurity and intrusion detection sys- Algorithms, techniques, and applications,” ACM Comput. Surv., vol. 51,
tems, with a specific focus on the most recent methods based no. 5, 2018.
[16] M. H. Bhuyan, D. K. Bhattacharyya, and J. K. Kalita, “Network
on Deep Learning (DL), which represent the current state of anomaly detection: Methods, systems and tools,” IEEE Commun.
the art for intrusion detection in network traffic. Specifically, Surveys Tuts., vol. 16, no. 1, pp. 303–336, 2014.
we have considered methods based on deep belief networks, [17] A. L. Buczak and E. Guven, “A survey of data mining and machine
learning methods for cyber security intrusion detection,” IEEE Com-
autoencoders, convolutional neural networks, long short-term mun. Surveys Tuts., vol. 18, no. 2, pp. 1153–1176, 2016.
memory networks, and generative adversarial networks. In [18] P. Mishra, V. Varadharajan, U. Tupakula, and E. S. Pilli, “A detailed
contrast to previous surveys, this review considers studies that investigation and analysis of using machine learning techniques for
intrusion detection,” IEEE Commun. Surveys Tuts., vol. 21, no. 1, pp.
use common benchmark datasets to ensure a fair evaluation 686–728, 2019.
and comparison of the proposed algorithms. [19] D. Kwon, H. Kim, J. Kim, S. C. Suh, I. Kim, and K. J. Kim, “A survey
To provide a reference for how recent cybersecurity methods of Deep Learning-based network anomaly detection,” Cluster Comput.,
vol. 22, pp. 949–961, 2017.
use benchmark datasets for intrusion detection, in this survey, [20] E. Hodo, X. J. A. Bellekens, A. W. Hamilton, C. Tachtatzis, and R. C.
we have also reviewed the main datasets used for this purpose Atkinson, “Shallow and Deep networks intrusion detection system: A
by highlighting their potential for training effective ML- taxonomy and survey,” ArXiv, vol. abs/1701.02145, 2017.
[21] Y. Xin, L. Kong, Z. Liu, Y. Chen, Y. Li, H. Zhu, M. Gao, H. Hou,
based algorithms. In particular, we have considered the data and C. Wang, “Machine learning and Deep Learning methods for
collection procedures, the distributions of feature and attack cybersecurity,” IEEE Access, vol. 6, pp. 35 365–35 381, 2018.
types, and dataset reliability criteria. [22] H. Hindy, D. Brosset, E. Bayne, A. Seeam, C. Tachtatzis, R. C.
Atkinson, and X. J. A. Bellekens, “A taxonomy and survey of intrusion
By providing a survey of ML and DL approaches, along detection system design techniques, network threats and datasets,”
with descriptions of the benchmark datasets considered when CoRR, vol. abs/1806.03517, 2018.
developing recent methods, this review aims to provide a [23] A. Praseed and P. S. Thilagam, “DDoS attacks at the application
layer: Challenges and research perspectives for safeguarding web
practical road map for researchers in academia and industry applications,” IEEE Commun. Surveys Tuts., vol. 21, no. 1, pp. 661–
working in the field of ML and DL for cybersecurity applica- 685, 2019.
tions. [24] B. B. Zarpelo, R. S. Miani, C. T. Kawakani, and S. C. de Alvarenga, “A
survey of intrusion detection in Internet of Things,” J. Netw. Comput.
R EFERENCES Appl., vol. 84, no. C, pp. 25–37, 2017.
[25] C. Tsai, C. Lai, M. Chiang, and L. T. Yang, “Data mining for internet
[1] S. Muggleton, “Alan Turing and the development of artificial intelli- of things: A survey,” IEEE Commun. Surveys Tuts., vol. 16, no. 1, pp.
gence,” AI Commun., vol. 27, no. 1, pp. 3–10, 2014. 77–97, 2014.
[2] “WannaCry ransomware attack,” https://en.wikipedia.org/wiki/ [26] A. Nisioti, A. Mylonas, P. D. Yoo, and V. Katos, “From intrusion de-
WannaCry_ransomware_attack. tection to attacker attribution: A comprehensive survey of unsupervised
methods,” IEEE Commun. Surveys Tuts., vol. 20, no. 4, 2018.
IEEE SYSTEMS JOURNAL 13

[27] R. Abdulhammed, M. Faezipour, and K. M. Elleithy, “Network intru- [61] I. Perona, I. Gurrutxaga, O. Arbelaitz, J. I. Martín, J. Muguerza,
sion detection using hardware techniques: A review,” in Proc. of LISAT, and J. M. Pérez, “Service-independent payload analysis to improve
2016, pp. 1–7. intrusion detection in network traffic,” in Proc. of AusDM, 2008.
[28] J. Kim, P. J. Bentley, U. Aickelin, J. Greensmith, G. Tedesco, and [62] National Security Agency, “Cyber Defense Exercise (CDX),” https:
J. Twycross, “Immune system approaches to intrusion detection – a //apps.nsa.gov/iaarchive/programs/cyber-defense-exercise/index.cfm,
review,” Nat. Comput., vol. 6, no. 4, pp. 413–466, 2007. 2001.
[29] A. Volkova, M. Niedermeier, R. Basmadjian, and H. de Meer, “Security [63] I. Homoliak, M. Barabas, P. Chmelar, M. Drozd, and P. Hanacek,
challenges in control network protocols: A survey,” IEEE Commun. “ASNM: Advanced security network metrics for attack vector descrip-
Surveys Tuts., vol. 21, no. 1, pp. 619–639, 2019. tion,” in Proc. of SAM, 2013.
[30] O. Savas and J. Deng, Big Data Analytics in Cybersecurity. Auerbach [64] R. Pang, M. Allman, V. Paxson, and J. Lee, “The devil and packet trace
Publications, 2017. anonymization,” SIGCOMM Comput. Commun. Rev., vol. 36, no. 1, pp.
[31] F. Pacheco, E. Exposito, M. Gineste, C. Baudoin, and J. Aguilar, 29–38, 2006.
“Towards the deployment of machine learning solutions in network [65] S. Saad, I. Traore, A. Ghorbani, B. Sayed, D. Zhao, W. Lu, J. Felix,
traffic classification: A systematic survey,” IEEE Commun. Surveys and P. Hakimian, “Detecting P2P botnets through network behavior
Tuts., vol. 21, no. 2, pp. 1988–2014, 2019. analysis and machine learning,” in Proc. of PST, 2011, pp. 174–180.
[32] “LibPCAP,” https://www.tcpdump.org. [66] R. Fontugne, P. Borgnat, P. Abry, and K. Fukuda, “MAWILab: Com-
[33] “WinPCAP,” https://www.winpcap.org. bining diverse anomaly detectors for automated anomaly labeling and
[34] “Snort,” https://www.snort.org. performance benchmarking,” in Proc. of CoNEXT, 2010.
[35] “Wireshark,” https://www.wireshark.org. [67] S. García, M. Grill, J. Stiborek, and A. Zunino, “An empirical com-
[36] “tshark,” https://www.wireshark.org/docs/man-pages/tshark.html. parison of botnet detection methods,” Comput. Secur., vol. 45, 2014.
[37] “TCPDump,” https://www.tcpdump.org. [68] University of Massachusetts Amherst - Laboratory for Advanced Soft-
[38] “Networkminer,” https://www.netresec.com/?page=NetworkMiner. ware Systems, “UMassTraceRepository,” http://traces.cs.umass.edu/
[39] “Rapidminer,” https://rapidminer.com. index.php/Network/Network, 2018.
[40] “Scapy,” https://scapy.net. [69] A. Sperotto, R. Sadre, F. van Vliet, and A. Pras, “A labeled data set
[41] “Cisco Netflow,” https://www.cisco.com/c/en/us/products/ for flow-based intrusion detection,” in IP Operations and Management,
ios-nx-os-software/ios-netflow/index.html. ser. Lect. Notes in Comput. Sc. Springer, 2009, pp. 39–50.
[42] “Nfdump,” https://github.com/phaag/nfdump. [70] Center for Applied Internet Data Analysis, “Data Collection, Curation
[43] D. Jurafsky and J. H. Martin, Speech and Language Processing: An and Sharing,” https://www.caida.org/data/, 2018.
Introduction to Natural Language Processing, Computational Linguis- [71] G. Creech and J. Hu, “Generation of a new IDS test dataset: Time to
tics, and Speech Recognition, 1st ed. Prentice Hall PTR, 2000. retire the KDD collection,” in Proc. of WCNC), 2013, pp. 4487–4492.
[44] R. Hofstede, P. Čeleda, B. Trammell, I. Drago, R. Sadre, A. Sperotto, [72] R. Salakhutdinov and G. Hinton, “Deep boltzmann machines,” in Proc.
and A. Pras, “Flow monitoring explained: From packet capture to of AISTATS, 2009, pp. 448–455.
data analysis with NetFlow and IPFIX,” IEEE Commun. Surveys Tuts., [73] I. Goodfellow, Y. Bengio, and A. Courville, “Autoencoders,” in Deep
vol. 16, no. 4, pp. 2037–2064, 2014. Learning. MIT Press, 2016.
[45] M. Tavallaee, E. Bagheri, W. Lu, and A. A. Ghorbani, “A detailed [74] Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based
analysis of the KDD CUP 99 data set,” in Proc. of CISDA, 2009. learning applied to document recognition,” Proc. of the IEEE, vol. 86,
[46] X. Jing, Z. Yan, and W. Pedrycz, “Security data collection and data no. 11, pp. 2278–2324, 1998.
analytics in the internet: A survey,” IEEE Commun. Surveys Tuts., [75] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural
vol. 21, no. 1, pp. 586–618, 2019. Computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[47] J. Fonseca, M. Vieira, and H. Madeira, “Testing and comparing web [76] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley,
vulnerability scanning tools for SQL Injection and XSS attacks,” in S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,”
Proc. of PRDC), 2007, pp. 365–372. in Proc. of NIPS, 2014, p. 2672–2680.
[48] A. Lazarevic, V. Kumar, and J. Srivastava, “Intrusion detection: A [77] G. E. Hinton, “Training products of experts by minimizing contrastive
survey,” Managing Cyber Threats, vol. 5, pp. 19–78, 2005. divergence,” Neural Comput., vol. 14, no. 8, p. 1771–1800, 2002.
[49] University of the Aegean, “AWID2018 dataset,” http://icsdweb.aegean. [78] G. E. Hinton, S. Osindero, and Y.-W. Teh, “A fast learning algorithm
gr/awid/features.html, 2018. for deep belief nets,” Neural Comput., vol. 18, no. 7, 2006.
[50] Canadian Institute for Cybersecurity, “Intrusion Detection Evalua- [79] M. A. Salama, H. Eid, R. Ramadan, A. Darwish, and A. E. Hassanien,
tion Dataset (CICIDS2017),” https://www.unb.ca/cic/datasets/ids-2017. “Hybrid intelligent intrusion detection scheme,” Adv. Intell. Soft Com-
html, 2017. put., vol. 96, pp. 295–302, 2011.
[51] University of California, Irvine (UCI), “KDD Cup 1999,” http://www. [80] G. Zhao, C. Zhang, and L. Zheng, “Intrusion detection using Deep
kdd.org/kdd-cup/view/kdd-cup-1999, 1999. Belief Network and probabilistic neural network,” in Proc. of CSE,
[52] Kyoto University, “Traffic Data from Kyoto University’s Honeypots,” vol. 1, 2017, pp. 639–642.
http://www.takakura.com/Kyoto_data, 2015. [81] N. Gao, L. Gao, Q. Gao, and H. Wang, “An intrusion detection model
[53] N. Moustafa and J. Slay, “UNSW-NB15: a comprehensive data set for based on Deep Belief Networks,” in Proc. of CBD, 2014, pp. 247–252.
network intrusion detection systems,” in Proc. of MilCIS, 2015. [82] M. Z. Alom, V. Bontupalli, and T. M. Taha, “Intrusion detection using
[54] I. Sharafaldin, A. H. Lashkari, and A. A. Ghorbani, “Toward generating Deep Belief Networks,” in Proc. of NAECON, 2015, pp. 339–344.
a new intrusion detection dataset and intrusion traffic characterization,” [83] K. Alrawashdeh and C. Purdy, “Toward an online anomaly intrusion
in Proc. of ICISSP, 2018. detection system based on Deep Learning,” in Proc. of ICMLA, 2016,
[55] Massachussets Institute of Technology, “1998 DARPA Intrusion pp. 195–200.
Detection Evaluation Dataset,” https://www.ll.mit.edu/r-d/datasets/ [84] E. R. Merino, F. M. Castrillejo, J. D. Pin, and D. B. Prats, “Weighted
1998-darpa-intrusion-detection-evaluation-dataset, 1998. contrastive divergence,” CoRR, vol. abs/1801.02567, 2018.
[56] B. Sangster, T. J. O’Connor, T. Cook, R. Fanelli, E. Dean, W. J. Adams, [85] NVIDIA, “CUDA,” https://developer.nvidia.com/cuda-zone, 2020.
C. Morrell, and G. Conti, “Toward instrumenting network warfare [86] B. Abolhasanzadeh, “Nonlinear dimensionality reduction for intrusion
competitions to generate labeled datasets,” in Proc. of CSET, 2009. detection using Auto-Encoder bottleneck features,” in Proc. of IKT,
[57] Canadian Institute for Cybersecurity, “Intrusion Detection Evalu- 2015, pp. 1–5.
ation Dataset (ISCXIDS2012),” https://www.unb.ca/cic/datasets/ids. [87] M. Yousefi-Azar, V. Varadharajan, L. Hamey, and U. Tupakula,
html, 2012. “Autoencoder-based feature learning for cyber security applications,”
[58] A. Shiravi, H. Shiravi, M. Tavallaee, and A. A. Ghorbani, “Toward in Proc. of IJCNN, 2017, pp. 3854–3861.
developing a systematic approach to generate benchmark datasets for [88] V. L. Cao, M. Nicolau, and J. McDermott, “A Hybrid Autoencoder and
intrusion detection,” Comput. Secur., vol. 31, no. 3, pp. 357–374, 2012. density estimation model for anomaly detection,” in Proc. of PPSN,
[59] Canadian Institute for Cybersecurity, “DoS dataset (CIC DoS dataset 2016, pp. 717–726.
2017),” https://www.unb.ca/cic/datasets/dos-dataset.html, 2017. [89] B. Zong, Q. Song, M. R. Min, W. Cheng, C. Lumezanu, D. ki Cho,
[60] ALDAPA, “Gure-Kddcup dataset,” http://www.sc.ehu.es/acwaldap/ and H. Chen, “Deep Autoencoding Gaussian Mixture Model for
gureKddcup, 2008. unsupervised anomaly detection,” in Proc. of ICLR, 2018.
[90] A. Y. Javaid, Q. Niyaz, W. Sun, and M. Alam, “A Deep Learning
approach for network intrusion detection system,” in Proc. of BICT,
2015.
IEEE SYSTEMS JOURNAL 14

[91] P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P.-A. Manzagol, [117] S. Z. Lin, Y. Shi, and Z. Xue, “Character-level intrusion detection based
“Stacked denoising autoencoders: Learning useful representations in a on Convolutional Neural Networks,” in Proc. of IJCNN, 2018, pp. 1–8.
deep network with a local denoising criterion,” J. Mach. Learn. Res., [118] Y. Xiao, C. Xing, T. Zhang, and Z. Zhao, “An intrusion detection model
vol. 11, p. 3371–3408, 2010. based on feature reduction and Convolutional Neural Networks,” IEEE
[92] Y. Yu, J. Long, and Z. Cai, “Network intrusion detection through Access, vol. 7, pp. 42 210–42 219, 2019.
stacking dilated Convolutional Autoencoders,” Secur. Commun. Netw., [119] G. Feng, B. Li, M. Yang, and Z. Yan, “V-CNN: Data visualizing based
vol. 2017, pp. 1–10, 2017. Convolutional Neural Network,” in Proc. of ICSPCC, 2018, pp. 1–6.
[93] Q. Niyaz, W. Sun, and A. Y. Javaid, “A Deep Learning based DDoS [120] R. Ronen, M. Radu, C. Feuerstein, E. Yom-Tov, and M. Ahmadi, “Mi-
detection system in software-defined networking (sdn),” EAI Endorsed crosoft malware classification challenge,” CoRR, vol. abs/1802.10135,
Trans. on Security and Safety, vol. 4, no. 12, 2017. 2018.
[94] N. Shone, T. N. Ngoc, V. D. Phai, and Q. Shi, “A Deep Learning [121] S.-N. Nguyen, V.-Q. Nguyen, J. Choi, and K. Kim, “Design and
approach to network intrusion detection,” IEEE Trans. Emerg. Topics implementation of intrusion detection system using Convolutional
Comput. Intell., vol. 2, no. 1, pp. 41–50, 2018. Neural Network for DoS detection,” in Proc. of ICMLSC, 2018.
[95] F. Farahnakian and J. Heikkonen, “A Deep Auto-Encoder based ap- [122] S. Park, M. Kim, and S. Lee, “Anomaly detection for HTTP using
proach for intrusion detection system,” in Proc. of ICACT, 2018. Convolutional Autoencoders,” IEEE Access, vol. 6, 2018.
[96] L. R. Parker, P. D. Yoo, T. A. Asyhari, L. Chermak, Y. Jhi, and K. Taha, [123] R. Kruse, C. Borgelt, C. Braune, S. Mostaghim, M. Steinbrecher,
“DEMISe: Interpretable Deep extraction and mutual information selec- F. Klawonn, and C. Moewes, Computational Intelligence: A Method-
tion techniques for IoT intrusion detection,” in Proc. of ARES, 2019, ological Introduction, 2nd ed. Springer, 2016.
pp. 98:1–98:10. [124] A. Brown, A. Tuor, B. Hutchinson, and N. Nichols, “Recurrent neural
[97] D. P. Kingma and M. Welling, “Auto-encoding variational bayes,” in network attention mechanisms for interpretable system log anomaly
Proc. of ICLR, 2014. detection,” in Proc. of MLCS, 2018, pp. 1–8.
[98] Q. P. Nguyen, K. W. Lim, D. M. Divakaran, K. H. Low, and M. C. [125] G. Kim, H. Yi, J. Lee, Y. Paek, and S. Yoon, “LSTM-based system-call
Chan, “GEE: A gradient-based explainable Variational Autoencoder for language modeling and robust ensemble method for designing host-
network anomaly detection,” in Proc. of CNS, 2019, pp. 91–99. based intrusion detection systems,” ArXiv, vol. abs/1611.01726, 2016.
[99] G. Maciá-Fernández, J. Camacho, R. Magán-Carrión, P. García- [126] F. Jiang, Y. Fu, B. B. Gupta, F. Lou, S. Rho, F. Meng, and Z. Tian,
Teodoro, and R. Therón, “UGR’16: A new dataset for the evaluation of “Deep Learning based multi-channel intelligent attack detection for
cyclostationarity-based network IDSs,” Computers & Security, vol. 73, data security,” IEEE Trans. Sustain. Comput., pp. 1–1, 2018.
pp. 411–424, 2018. [127] R. Vinayakumar, K. P. Soman, and P. Poornachandran, “Applying
[100] L. Vu, V. L. Cao, Q. U. Nguyen, D. N. Nguyen, D. T. Hoang, and Convolutional Neural Network for network intrusion detection,” in
E. Dutkiewicz, “Learning latent distribution for distinguishing network Proc. of ICACCI, 2017, pp. 1222–1228.
traffic in intrusion detection system,” in Proc. of ICC, 2019, pp. 1–6. [128] W. Wang, Y. Sheng, J. Wang, X. Zeng, X. Ye, Y. Huang, and
[101] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification M. Zhu, “HAST-IDS: Learning hierarchical spatial-temporal features
with Deep Convolutional Neural Networks,” Commun. ACM, vol. 60, using Deep Neural Networks to improve intrusion detection,” IEEE
no. 6, p. 84–90, 2017. Access, vol. 6, pp. 1792–1806, 2018.
[102] A. Genovese, V. Piuri, K. N. Plataniotis, and F. Scotti, “PalmNet: [129] Y. Zhang, X. Chen, L. Jin, X. Wang, and D. Guo, “Network intrusion
Gabor-PCA Convolutional Networks for touchless palmprint recogni- detection: Based on Deep Hierarchical Network and original flow data,”
tion,” IEEE Trans. Inf. Forensics Security, vol. 14, no. 2, 2019. IEEE Access, vol. 7, pp. 37 004–37 016, 2019.
[103] R. Donida Labati, A. Genovese, E. Muñoz, V. Piuri, and F. Scotti, [130] M. Elbayad, L. Besacier, and J. Verbeek, “Pervasive attention: 2D
“A novel pore extraction method for heterogeneous fingerprint images Convolutional Neural Networks for sequence-to-sequence prediction,”
using Convolutional Neural Networks,” Pattern Recognit. Lett., vol. CoRR, vol. abs/1808.03867, 2018.
113, no. 1, pp. 58–66, 2018. [131] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang,
[104] A. Genovese, V. Piuri, F. Scotti, and S. Vishwakarma, “Touchless T. Weyand, M. Andreetto, and H. Adam, “MobileNets: Efficient
palmprint and finger texture recognition: A Deep Learning fusion Convolutional Neural Networks for mobile vision applications,” CoRR,
approach,” in Proc. of CIVEMSA, 2019. vol. abs/1704.04861, 2017.
[105] R. Donida Labati, A. Genovese, E. Muñoz, V. Piuri, and F. Scotti, “Ap- [132] S. Bai, J. Z. Kolter, and V. Koltun, “An empirical evaluation of generic
plications of computational intelligence in industrial and environmental convolutional and recurrent networks for sequence modeling,” CoRR,
scenarios,” in Learning Systems: from Theory to Practice. Springer, vol. abs/1803.01271, 2018.
2018, vol. 756, pp. 29–46. [133] Y. Hong, U. Hwang, J. Yoo, and S. Yoon, “How generative adversarial
[106] Z. Li, Z. Qin, K. Huang, X. Yang, and S. Ye, “Intrusion detection using networks and their variants work: An overview,” ACM Comput. Surv.,
Convolutional Neural Networks for representation learning,” in Neural vol. 52, no. 1, pp. 10:1–10:43, 2019.
Information Processing. Springer, 2017, pp. 858–866. [134] A. Genovese, V. Piuri, and F. Scotti, “Towards explainable face aging
[107] “One-hot encoding,” https://www.sciencedirect.com/topics/ with Generative Adversarial Networks,” in Proc. of ICIP, 2019.
computer-science/one-hot-encoding, 2020. [135] T. Schlegl, P. Seeböck, S. M. Waldstein, U. Schmidt-Erfurth, and
[108] M. Kalash, M. Rochan, N. Mohammed, N. D. B. Bruce, Y. Wang, G. Langs, “Unsupervised anomaly detection with Generative Adversar-
and F. Iqbal, “Malware classification with Deep Convolutional Neural ial Networks to guide marker discovery,” CoRR, vol. abs/1703.05921,
Networks,” in Proc. of NTMS, 2018, pp. 1–5. 2017.
[109] T. Kim, S. C. Suh, H. Kim, J. Kim, and J. Kim, “An encoding technique [136] H. Zenati, C. S. Foo, B. Lecouat, G. Manek, and V. R. Chandrasekhar,
for CNN-based network anomaly detection,” in Proc. of Big Data, “Efficient GAN-based anomaly detection,” ArXiv, vol. abs/1802.06222,
2018, pp. 2960–2965. 2018.
[110] R. Blanco, P. Malagón, J. J. Cilla, and J. M. Moya, “Multiclass network [137] A. Gharib, I. Sharafaldin, A. H. Lashkari, and A. A. Ghorbani, “An
attack classifier using CNN tuned with genetic algorithms,” in Proc. of evaluation framework for intrusion detection dataset,” in Proc. of
PATMOS, 2018, pp. 177–182. ICISS), 2016, pp. 1–6.
[111] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image [138] A. Torralba and A. A. Efros, “Unbiased look at dataset bias,” in Proc.
recognition,” in Proc. of CVPR, 2016, pp. 770–778. of CVPR, 2011, pp. 1521–1528.
[112] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, [139] T. Tommasi, N. Patricia, B. Caputo, and T. Tuytelaars, “A deeper
D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with look at dataset bias,” in Domain Adaptation in Computer Vision
convolutions,” in Proc. of CVPR, 2015. Applications. Springer, 2017, pp. 37–55.
[113] K. Simonyan and A. Zisserman, “Very deep convolutional networks [140] J. McHugh, “Testing intrusion detection systems: a critique of the 1998
for large-scale image recognition,” in Proc. of ICLR, 2015. and 1999 DARPA intrusion detection system evaluations as performed
[114] K. Wu, Z. Chen, and W. Li, “A novel intrusion detection model for a by lincoln laboratory,” ACM Trans. Inf. Syst. Secur., vol. 3, pp. 262–
massive network using Convolutional Neural Networks,” IEEE Access, 294, 2000.
vol. 6, pp. 50 850–50 859, 2018. [141] J. G. Moreno-Torres, T. Raeder, R. Alaíz-Rodríguez, N. V. Chawla, and
[115] U. Çekmez, Z. Erdem, A. G. Yavuz, O. K. Sahingoz, and A. Buldu, F. Herrera, “A unifying view on dataset shift in classification,” Pattern
“Network anomaly detection with Deep Learning,” in Proc. of SIU, Recognit., vol. 45, pp. 521–530, 2012.
2018, pp. 1–4. [142] R. F. Fouladi, T. Seifpoor, and E. Anarim, “Frequency characteristics
[116] M. Ito and H. Iyatomi, “Web application firewall using character-level of DoS and DDoS attacks,” in Proc. of SIU, 2013, pp. 1–4.
Convolutional Neural Network,” in Proc. of CSPA, 2018, pp. 103–106.

Module 2 - ICT Policies and Issues: Implications To Teaching and Learning
89% (9)
Module 2 - ICT Policies and Issues: Implications To Teaching and Learning
6 pages
Machine Learning and Deep Learning Methods For Intrusion Detection Systems - Recent Developments and Challenges
No ratings yet
Machine Learning and Deep Learning Methods For Intrusion Detection Systems - Recent Developments and Challenges
33 pages
Intrusion Detection
No ratings yet
Intrusion Detection
76 pages
Machine Learning and Deep Learning Based Intrusion Detection in Cloud Environment A Review
No ratings yet
Machine Learning and Deep Learning Based Intrusion Detection in Cloud Environment A Review
9 pages
MFM 9013 Block Led Type Manual
No ratings yet
MFM 9013 Block Led Type Manual
31 pages
Enhancing Cybersecurity Through Machine Learning-Based Intrusion Detection Systems
No ratings yet
Enhancing Cybersecurity Through Machine Learning-Based Intrusion Detection Systems
26 pages
Intrusion Detection System A Comparative Study of
No ratings yet
Intrusion Detection System A Comparative Study of
31 pages
A Review On Cybersecurity Based On Machine Learnin
No ratings yet
A Review On Cybersecurity Based On Machine Learnin
12 pages
Research Paper DR Abubuckerand MR John 1
No ratings yet
Research Paper DR Abubuckerand MR John 1
6 pages
Network Intrusion Detection in Big Datasets Using Spark Environment and Incremental Learning
No ratings yet
Network Intrusion Detection in Big Datasets Using Spark Environment and Incremental Learning
8 pages
Warehouse Management Software (WMS) : Software Requirement Specification (SRS)
No ratings yet
Warehouse Management Software (WMS) : Software Requirement Specification (SRS)
94 pages
Article 2
No ratings yet
Article 2
16 pages
Corrected Intrusion Dection 1-3
No ratings yet
Corrected Intrusion Dection 1-3
51 pages
Intrusion Detection Model Using Machine Learning Algorithms On NSL-KDD Dataset
No ratings yet
Intrusion Detection Model Using Machine Learning Algorithms On NSL-KDD Dataset
14 pages
IDS Theisis
No ratings yet
IDS Theisis
30 pages
Machine Learning For Intrusion Detection in Cyber Security: Applications, Challenges, and Recommendations
No ratings yet
Machine Learning For Intrusion Detection in Cyber Security: Applications, Challenges, and Recommendations
24 pages
Cyber Intrusion Prediction and Taxonomy System Using Deep Learning and Distributed Big Data Processing
No ratings yet
Cyber Intrusion Prediction and Taxonomy System Using Deep Learning and Distributed Big Data Processing
8 pages
Sample
No ratings yet
Sample
14 pages
Protocols For Authentication and Key Establishment
No ratings yet
Protocols For Authentication and Key Establishment
542 pages
Machine Learning and Deep Learning Methods For Intrusion Detection Systems - A Survey
No ratings yet
Machine Learning and Deep Learning Methods For Intrusion Detection Systems - A Survey
29 pages
.Cyber Attack Detection and Notifying System Using ML Techniques
No ratings yet
.Cyber Attack Detection and Notifying System Using ML Techniques
7 pages
Paper 5
No ratings yet
Paper 5
22 pages
Artificial Intelligence Based Intrusion Detection
No ratings yet
Artificial Intelligence Based Intrusion Detection
10 pages
Intrusion Detection System Using Machine Learning
No ratings yet
Intrusion Detection System Using Machine Learning
4 pages
1 s20 S2542660521001037 Main
No ratings yet
1 s20 S2542660521001037 Main
18 pages
Machine Learning-Based Intrusion Detection System For Detecting Web Attacks
No ratings yet
Machine Learning-Based Intrusion Detection System For Detecting Web Attacks
11 pages
Performance Evaluation of Machine Learning Algorithms For Intrusion Detection System
No ratings yet
Performance Evaluation of Machine Learning Algorithms For Intrusion Detection System
20 pages
Table of Content
No ratings yet
Table of Content
55 pages
Comparative Analysis of Feature Selection Techniques For LSTM Based Network Intrusion Detection Models
No ratings yet
Comparative Analysis of Feature Selection Techniques For LSTM Based Network Intrusion Detection Models
11 pages
Sensors 23 02415
No ratings yet
Sensors 23 02415
18 pages
Yusuf
No ratings yet
Yusuf
16 pages
Elevating Cybersecurity Using AI and Deep Learning For Intrusion Detection Reinforcement
No ratings yet
Elevating Cybersecurity Using AI and Deep Learning For Intrusion Detection Reinforcement
27 pages
A Survey of Neural Networks Usage For Intrusion Detection Systems
No ratings yet
A Survey of Neural Networks Usage For Intrusion Detection Systems
18 pages
A Study of Network Intrusion Detection S
No ratings yet
A Study of Network Intrusion Detection S
27 pages
Machine Learning Technical Report
No ratings yet
Machine Learning Technical Report
12 pages
s42400 021 00103 8
No ratings yet
s42400 021 00103 8
22 pages
8499-Article Text-9477-2-10-20231102
No ratings yet
8499-Article Text-9477-2-10-20231102
12 pages
Splnproc1703 D
No ratings yet
Splnproc1703 D
12 pages
Deep Learning in Intrusion Detection Systems-2018
No ratings yet
Deep Learning in Intrusion Detection Systems-2018
4 pages
Transactions On Emerging Telecommunications Technologies 2020 Ahmad Network Intrusion Detection System A Systemati
No ratings yet
Transactions On Emerging Telecommunications Technologies 2020 Ahmad Network Intrusion Detection System A Systemati
29 pages
Applsci 13 07507 v4
No ratings yet
Applsci 13 07507 v4
34 pages
01-JCCE2202270 Online
No ratings yet
01-JCCE2202270 Online
10 pages
Machine Learning Based Network Intrusion Detection For Big and Imbalanced Data Using Oversampling, Stacking Feature Embedding and Feature Extraction
No ratings yet
Machine Learning Based Network Intrusion Detection For Big and Imbalanced Data Using Oversampling, Stacking Feature Embedding and Feature Extraction
44 pages
(FREE PDF Sample) Build A Frontend Web Framework From Scratch 1st Edition Ángel Sola Orbaiceta Ebooks
No ratings yet
(FREE PDF Sample) Build A Frontend Web Framework From Scratch 1st Edition Ángel Sola Orbaiceta Ebooks
67 pages
Lesson Plan - Internet of Things - Docx-1709033060448
No ratings yet
Lesson Plan - Internet of Things - Docx-1709033060448
6 pages
AI and ML Techniques For Intrusion Detection
No ratings yet
AI and ML Techniques For Intrusion Detection
6 pages
Symmetry 15 01251
No ratings yet
Symmetry 15 01251
31 pages
A Review On Machine Learning Methods For Intrusion
No ratings yet
A Review On Machine Learning Methods For Intrusion
8 pages
Intrusion Detectionin Cyber Security
No ratings yet
Intrusion Detectionin Cyber Security
11 pages
TSP JCS 46915
No ratings yet
TSP JCS 46915
23 pages
323-1851-310.r1.2 - OME6500 Provisioning and Operating
No ratings yet
323-1851-310.r1.2 - OME6500 Provisioning and Operating
340 pages
Performance Analysis of Machine Learning
No ratings yet
Performance Analysis of Machine Learning
22 pages
Project Paper Publication
No ratings yet
Project Paper Publication
10 pages
Paper 2
No ratings yet
Paper 2
11 pages
Elevating Cybersecurity Using AI and Deep Learning For Intrusion Detection Reinforcement
No ratings yet
Elevating Cybersecurity Using AI and Deep Learning For Intrusion Detection Reinforcement
9 pages
HDLNIDS Hybrid Deep-Learning
No ratings yet
HDLNIDS Hybrid Deep-Learning
17 pages
10.1515 - Eng 2022 0403
No ratings yet
10.1515 - Eng 2022 0403
11 pages
HCIA - DATACOM Single Choice
100% (3)
HCIA - DATACOM Single Choice
143 pages
Machine Learning Techniques For Network-1
No ratings yet
Machine Learning Techniques For Network-1
7 pages
631eaa91dbcfb7 78471842
No ratings yet
631eaa91dbcfb7 78471842
13 pages
Batch 1 - 4 CSE C
No ratings yet
Batch 1 - 4 CSE C
9 pages
Network Intrusion Detection and Prevention
No ratings yet
Network Intrusion Detection and Prevention
8 pages
DNS HowTo
No ratings yet
DNS HowTo
4 pages
Class-Viii Artificial Intelligence Ciricullum Plan 2024-25
No ratings yet
Class-Viii Artificial Intelligence Ciricullum Plan 2024-25
4 pages
Machine Learning and Deep Learning Methods For Cybersecurity
No ratings yet
Machine Learning and Deep Learning Methods For Cybersecurity
17 pages
Access Control Guide
No ratings yet
Access Control Guide
63 pages
Reference
No ratings yet
Reference
5 pages
Deep Learning Approach For Intelligent Intrusion Detection System
No ratings yet
Deep Learning Approach For Intelligent Intrusion Detection System
5 pages
Procedure For Arrangements of Gate4SPICE Events
No ratings yet
Procedure For Arrangements of Gate4SPICE Events
14 pages
Gujarat Technological University: Aditya Silver Oak Institute of Technology
No ratings yet
Gujarat Technological University: Aditya Silver Oak Institute of Technology
21 pages
PCCSA - Prepaway.premium - Exam.50q: Number: PCCSA Passing Score: 800 Time Limit: 120 Min File Version: 1.0
No ratings yet
PCCSA - Prepaway.premium - Exam.50q: Number: PCCSA Passing Score: 800 Time Limit: 120 Min File Version: 1.0
17 pages
2024-GOI-IES-FAQs Updated 07 March 2024
No ratings yet
2024-GOI-IES-FAQs Updated 07 March 2024
8 pages
Anomaly Based Intrusion Detection Model Using Supervised Machine Learning Techniques
No ratings yet
Anomaly Based Intrusion Detection Model Using Supervised Machine Learning Techniques
5 pages
Vrealize Operations 86 Api Guide
No ratings yet
Vrealize Operations 86 Api Guide
30 pages
73 - VLAN Service Management
No ratings yet
73 - VLAN Service Management
19 pages
Intrusion Detection and Prevention in Networks Using Machine Learning and Deep Learning Approaches A Review
No ratings yet
Intrusion Detection and Prevention in Networks Using Machine Learning and Deep Learning Approaches A Review
4 pages
Top 40 CCNP Enterprise Interview Question and Answer
No ratings yet
Top 40 CCNP Enterprise Interview Question and Answer
6 pages
Aeroflex 3500A Maintenance Manual Looking or Who Can Provide CAL PASSWORD - Page 1
No ratings yet
Aeroflex 3500A Maintenance Manual Looking or Who Can Provide CAL PASSWORD - Page 1
7 pages
04 Slide Layouts L1
No ratings yet
04 Slide Layouts L1
11 pages
ISABELLA E MARIANA - YETERIAN - G&V - Profile
No ratings yet
ISABELLA E MARIANA - YETERIAN - G&V - Profile
3 pages
Instalation and Guidlines For Evonybot
No ratings yet
Instalation and Guidlines For Evonybot
7 pages
Wikipedia - About - Wikipedia
No ratings yet
Wikipedia - About - Wikipedia
3 pages
English 6 DLL Quarter 1 Week 5
No ratings yet
English 6 DLL Quarter 1 Week 5
5 pages
Confirmation of Candidature Presentation Slides - Bryce Thomas - James Cook University
No ratings yet
Confirmation of Candidature Presentation Slides - Bryce Thomas - James Cook University
82 pages
Penawaran LagoonAvenue - Aiscomm
No ratings yet
Penawaran LagoonAvenue - Aiscomm
3 pages
Installation Guide - EN
No ratings yet
Installation Guide - EN
6 pages
Cisco Cloud Email Security: At-a-Glance
No ratings yet
Cisco Cloud Email Security: At-a-Glance
4 pages
How To Find Table Names in SAP
94% (17)
How To Find Table Names in SAP
11 pages
Escalation Points
No ratings yet
Escalation Points
2 pages
Advanced Network Defense: Architectures and Best Practices for Today’s Perimeter
From Everand
Advanced Network Defense: Architectures and Best Practices for Today’s Perimeter
Lawrence Bennier
No ratings yet

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.

Sysj 20

Uploaded by

Sysj 20

Uploaded by

IEEE SYSTEMS JOURNAL 1

A Comprehensive Survey of Databases and

C YBERSECURITY systems have been of great impor-

Fig. 1. Example of a Deep Belief Network (DBN). DBNs are obtained by

dimensionality reduction phase. Similar to that in [86], the

method proposed in [106] converts feature attributes into

to perform compared with cases in which a GAN is used to

You might also like

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.