Computer Science Review 39 (2021) 100317

Contents lists available at ScienceDirect

Computer Science Review


journal homepage: www.elsevier.com/locate/cosrev

Review article

Deep Learning Algorithms for Cybersecurity Applications: A Technological and Status Review

Priyanka Dixit , Sanjay Silakari
Department of Computer Science & Engg, University Institute of Technology, Rajiv Gandhi Proudyogiki Vishwavidyalaya Bhopal (M.P), India

Article info

Article history:
Received 23 May 2020
Accepted 6 November 2020
Available online xxxx

Keywords:
Cybersecurity
Deep learning
Attack
Supervised and unsupervised

Abstract

Cybersecurity mainly protects the hardware, software, and data present in a system that has an active internet connection from external attacks. Organizations mainly deploy cybersecurity for their databases and systems to prevent unauthorized access. Different forms of attacks like phishing, spear-phishing, drive-by attacks, password attacks, denial of service, etc. are responsible for these security problems. In this survey, we analyzed and reviewed the usage of deep learning algorithms for cybersecurity applications. Deep learning, which is also known as Deep Neural Networks, includes machine learning techniques that enable the network to learn from unsupervised data and solve complex problems. Here, 80 papers from 2014 to 2019 have been used and analyzed. Deep learning approaches such as Convolutional Neural Network (CNN), Auto Encoder (AE), Deep Belief Network (DBN), Recurrent Neural Network (RNN), Generative Adversarial Network (GAN), and Deep Reinforcement Learning (DIL) are used to categorize the papers referred. Each specific technique is discussed with its algorithms, platforms, dataset, and potential benefits. Papers relating deep learning to cybersecurity were published in the largest number in 2018, and 18% of the published articles originate from the UK. In addition, the papers are selected from a variety of journals, and 30% of the papers used are from Elsevier journals. From the experimental analysis, it is clear that the deep learning model improved the accuracy, scalability, reliability, and performance of the cybersecurity applications when applied in real time.

© 2020 Elsevier Inc. All rights reserved.

Contents

1. Introduction
2. Cybersecurity attacks
    2.1. Type of attackers
    2.2. Adversaries goal
    2.3. Types of cybersecurity attacks
3. Deep learning and its classification trend of cybersecurity
    3.1. Convolutional Neural Network (CNN)
        3.1.1. Single CNN
        3.1.2. Multi-CNN
        3.1.3. Variants of CNN
        3.1.4. Acoustic model of CNN
        3.1.5. Limited weight sharing of CNN
        3.1.6. Cybersecurity applications using CNN
    3.2. Autoencoder (AE)
        3.2.1. Stacked Auto Encoder (SAE)
        3.2.2. Denoising Auto Encoder (DAE)
        3.2.3. Variational Auto Encoder (VAE)
    3.3. Deep Belief Network (DBN)
        3.3.1. Deep Boltzmann Machine (DBM)
        3.3.2. Restricted Boltzmann Machine (RBM)
        3.3.3. Deep Restricted Boltzmann Machine (DRBM)
    3.4. Recurrent Neural Network (RNN)
        3.4.1. Bidirectional RNN (BRNN)
        3.4.2. Long Short Term Memory (LSTM)
        3.4.3. Acoustic model of RNN (ACNN)
        3.4.4. Gated recurrent unit
    3.5. Generative Adversarial Network (GAN)
    3.6. Deep Reinforcement Learning (RIL)
        3.6.1. Multi-task reinforcement (MTR)
        3.6.2. Multi-agent reinforcement (MAR)
        3.6.3. Asynchronous reinforcement (AR)
        3.6.4. Q-learning Reinforcement (QR)
4. Analysis and discussion
    4.1. Performance analysis of cybersecurity attack detection papers
    4.2. Comparative analysis
5. Open issues and future research directions
6. Conclusion
Declaration of competing interest
References

∗ Corresponding author.
E-mail addresses: dixitpriyanka384@gmail.com, priyankadxt048@gmail.com (P. Dixit), ssilakari@yahoo.com (S. Silakari).

https://doi.org/10.1016/j.cosrev.2020.100317
1574-0137/© 2020 Elsevier Inc. All rights reserved.

1. Introduction

Nowadays, cyberspace is developing rapidly because of cloud computing [1], big data [2], the Internet of Things, and the growth of software-based networks. One of the common problems in cyberspace is cybersecurity. Cybersecurity is a means of safeguarding systems, applications, and networks from potential digital attacks. The main aims of the adversaries who conduct these attacks are to modify or access confidential information, launder money from users, and interrupt normal business operations. The challenges associated with implementing cybersecurity policies in organizations are the large number of devices connected to the network and the novel attacks conducted by hackers. The different kinds of attacks are prevented by using tools like intrusion detection systems, firewalls, scanners, and antivirus software. The devices connected to the network are often subjected to various attacks. The internet offers interconnection between networks and supports the exchange of hardware, intelligence, software, information, and data between them. Hence, computer networks are very vulnerable to malware and other cybersecurity attacks.

Attackers are experienced in tracing out data from cyberspace [3]. The huge volume of data and confidential information is shielded with cybersecurity, and if any attack happens the whole organization is automatically alerted. Moreover, the anomalous detection characteristics, event correlation, and pattern identification are classified using data science concepts applied to cybersecurity. Mobile devices cannot be protected by an Intrusion Detection System (IDS) because of their limited battery power, mobility, and energy consumption characteristics. A protective shield can be built to safeguard applications with the help of machine learning algorithms [4–8]. The modern computer system adds additional computational complexity when processing a huge amount of information while offering security.

This challenge can be overcome by incorporating techniques of Artificial Intelligence (AI) [9]. The rapid development of computer-based research, methods, and applications to replicate human intelligence is called artificial intelligence. AI techniques can easily identify the malware present in an application and can take robust actions. AI is also used to process the vast amount of information that users generate on a daily basis. Machine learning (ML) with more security detection software, encoding, and thread extraction characteristics is required to identify these attacks [10]. However, the deep learning concept is more efficient at detecting cybersecurity issues. Deep learning is one of the powerful machine learning techniques powered by AI, and this research focuses on it. Deep learning techniques can efficiently process the vast amount of information present in cybersecurity datasets while withstanding the attacks [11]. Hence, many researchers have focused on cybersecurity issues with deep learning concepts [12].

Several researchers [13–20] have presented elaborate surveys of existing cybersecurity applications utilizing deep learning techniques. These studies were mainly conducted to motivate researchers working in the same field to upgrade the security of organizations vulnerable to various potential attacks. However, these articles did not cover the broad area of cybersecurity datasets used or the weaknesses present in these deep learning techniques. Therefore, the basic objective of this work is to introduce a bibliometric analysis of the deep learning approaches used for the detection of potential threats to cybersecurity. We have chosen research papers from the years 2011 to 2020 that address cybersecurity issues with deep learning concepts. Ultimately, we analyzed 80 research papers from different kinds of journals, and the results of the survey are presented in the sections below. The outline structure of deep learning based cybersecurity attack detection is described in Fig. 1.

Fig. 1. Structure of cybersecurity attack detection survey.

The contribution of this review article is explained as follows:

i. We identify the different cybersecurity attacks, namely denial of service, probe, malware, zero-day, phishing, sinkhole, and user root attacks, and how deep learning models solve these attacks.
ii. Next, the different variants of deep neural network models are analyzed and their functionalities are specified. The different types of neural networks studied are Convolutional Neural Network, Autoencoder, Deep Belief Network, Recurrent Neural Network, Generative Adversarial Network, and Deep Reinforcement Learning.
iii. A comparative analysis is conducted to review the different attacks encountered, the diverse platforms used, datasets, and learning models of various researchers in the field of cybersecurity using Deep Learning.
iv. This survey also provides the challenges faced by existing research and open issues.

The rest of this paper is organized as follows. The cybersecurity attacks are formulated in Section 2. In Section 3, the deep learning-based cybersecurity attacks are discussed. The current trend discussion and analysis are carried out in Section 4, and the challenging open issues and future research directions are formulated in Section 5. Finally, the paper is summarized in Section 6.

2. Cybersecurity attacks

The cybersecurity system is affected by different kinds of attacks such as denial of service, probe, malware, zero-day, phishing, sinkhole, user root, adversarial attacks, poisoning attacks, evasion attacks, integrity attacks, and causative attacks. Most researchers have used deep learning concepts for the detection of these attacks. In this survey, we analyzed different papers related to cybersecurity attack detection with the help of deep learning concepts, and a few of the attacks are discussed below. This section also addresses the types of attackers and their goals.

2.1. Type of attackers

The attacker's knowledge can be classified into three types. In the black box attack, the attacker has zero knowledge about the deep learning model. In the gray box model, the attacker knows the details of some of the components present in the model and has moderate knowledge about it. In the white box model, the attacker has complete knowledge about the model, and this scenario happens only in the worst case.

2.2. Adversaries goal

The adversaries' goals differ according to the following classification. If an adversary conducts a targeted attack on the neural network that leads to misclassification, it is known as an integrity violation. If the adversary targets the system availability and makes the system unavailable for a certain period of time, it is known as an availability violation. If the adversary tries to compromise confidential information, it is known as a privacy violation. The attacker conducts the attack mainly in two ways: a targeted attack or a random attack. In the targeted attack, the adversary targets only a specific part of the training sample to yield an incorrect output. In the random attack, the attacker focuses on any part of the training sample and the goal is to misclassify the output result.

2.3. Types of cybersecurity attacks

Denial of service attack (DoS): It is conducted by sending a large amount of traffic to the intended recipient, such that they are no longer able to access the service from the corresponding PC. The main intention of this attack is to freeze or stop the service permanently or temporarily [21].

Remote to local attack: The attackers take advantage of the system using the network connection and conduct attacks by means of the vulnerabilities (bugs) already existing in the system. In the local attack, the attackers gain unauthorized access to the system by using their already existing account on the system. The remote attacks are easier to prevent, but the local attacks are hard to detect. The attackers send packets between the network and machines during remote to local attacks. The vulnerability of machines is exploited [22].

Probing: The networks are scanned by the attackers, who easily collect information and data [23]. The machines and their services are mapped by the attackers.

User to root attack: The system and normal user accounts are easily traced by the attackers. In particular, the passwords are traced and the user data may be lost [24].

Adversarial Attacks: Adversarial attacks raise questions about whether deep learning is suitable for privacy-concerning applications. Xu et al. [25] presented an approach that considers the safety of text, image, and graphs used in the Deep Neural Network model. For example, a bank often takes ID proof utilizing a photograph to check whether the customer is an authenticated person or not. If a bank offers a loan to an unauthorized person, it suffers a huge loss. Thus, safety measures are critical in Deep Neural Networks. The adversary can attack a Deep Neural Network powered system by giving false inputs and causing the model to misclassify. In adversarial attacks, the attackers often insert perturbations similar to the training input used, and these attacks are often white-box attacks. The protection mechanisms used against white-box attacks often have limited success. Katzir and Elovici [26] presented an approach to overcome this attack with defensive distillation and the targeted gradient sign method, and they also analyzed the problems associated with these methods. p-tampering is an attack conducted against a learning algorithm by integrating malicious noise into it. Here the attacker modifies the training data with probability p, but is allowed to select only adversarial samples with accurate labels. Saeed Mahloujifar et al. [27] used the bias attack mechanism to overcome the problem by increasing the value of the real-valued function.

Poisoning and Evasion attacks: Poisoning attacks are conducted in the training phase of deep learning. To decrease the prediction accuracy of the deep learning algorithm, the adversary inserts the virus into the training samples. The evasion attack mainly targets the prediction process of deep learning. Here a wrong input is given to the neural network by the adversary to yield a wrong classification result. In both attacks, the attacker can control the input data. Jiang et al. [28] used the Particle Swarm Optimization (PSO) algorithm to overcome the poison and evasion attacks by focusing on the training phase in the case of poison attacks and the inference phase in the case of evasion attacks. They observed a classification accuracy decline from 95% to 33% for poison attacks and from 93% to 22% for evasion attacks when injected with malware samples. The Convolutional Neural Network, which takes input in the form of an image, is often subjected to evasion attacks. The attacker conducts these attacks by injecting pixel-level perturbations inside the image. The problem widely occurs
in license plate recognition, where the perturbations are inserted to alter the number on the license plate. The attackers often modify the images in a way that cannot be seen by the naked eye. Yaguan Qian et al. (2020) treated this as an intricate optimization problem and applied a genetic algorithm to it. The adversarial perturbation is identified by this algorithm. The authors indicate that this problem needs further exploration since it is a complex problem to be solved.

Integrity Attacks: Integrity attacks are mainly focused on altering or corrupting the data that resides on the system. The attacker mainly conducts this attack by encrypting the organization's important details and demanding a huge sum of money to decrypt them. Guangyu et al. (2017) presented an optimal switching data integrity attack. Here an attacker compromises a problem-free system by partially inserting the malware into the selected components and continuously conducting a targeted attack by switching between different systems. The authors aimed to find an optimal attack sequence of the compromised actuator by using limited energy to improve the quadratic performance.

Causative Attacks: The causative attack is mainly conducted by targeting the decision-making algorithm to yield an inaccurate classification of the neural network. This shows that most of the estimation algorithms are prone to causative attack. To overcome this problem, Sihag and Tajer [29] proposed a secure parameter estimation algorithm that can detect the attack and isolate the neural network model. The summary of the cybersecurity attacks is provided in Table 1.

3. Deep learning and its classification trend of cybersecurity

The most important subsection of machine learning is the deep learning technique. The classification of deep learning based on the cybersecurity attacks is portrayed in Fig. 2, and its subsections are discussed in the sections below.

3.1. Convolutional Neural Network (CNN)

The feed-forward neural network of CNN consists of convolutional layers, multiple hidden layers, pooling layers, and fully connected layers. The elements are represented using neurons and the array can store all the inputs. The arrays are two- and three-dimensional in nature. Generally, the fundamental element of CNN is the convolutional layer [31]. The convolutional kernel, which represents the weights, is applied to the original input, and the receptive field is a smaller window of that input [32]. The feature map is obtained from this calculation on the input. The basic structure of the convolutional neural network is illustrated in Fig. 3. The CNN based cybersecurity attack detection is divided into Single CNN, Multi-CNN, Variants of CNN, Acoustic model of CNN, and Limited Weight Sharing of CNN.

3.1.1. Single CNN
The deep security attack detection method adopts a single CNN, and a supervised training process is performed [33]. During security image recognition, the three-dimensional alignments are carried out in the 9th layer of CNN. No weight sharing is performed in the many locally connected CNN layers [34]. The effective training set is predicted from the bootstrapping procedure of the web-scale. The performance of cybersecurity attack detection is improved using a few schemes such as discriminative feature learning, feature fusion, novel learning algorithms, loss function design, and accepting exact activation.

3.1.2. Multi-CNN
The deep features are analyzed with multi-CNN, where more than one CNN is used. It is mainly used to amalgamate the features effectively. The deep features are extracted from various inputs and different environmental conditions. Each CNN is trained by an extra CNN, and the performance is enhanced [35]. Feature extraction from different regions and aspects is the basic operation of multi-CNN. Therefore, data computing and collection require considerable effort. The features of different malware can be extracted using this technique.

3.1.3. Variants of CNN
Different kinds of variants are used in CNN to enhance the performance of the system. The computational load and parameters are reduced by using downsampling [36]. Many CNN layouts are designed and modified using the kernel function.

3.1.4. Acoustic model of CNN
The shared weight, receptive field, and spatial sampling data are combined and used for acoustic CNN. The additional joining of max pooling and convolutional layers is also used in the new environment to adapt to the tasks easily [37]. The time axis is implemented for data convolution with no validations. The local correlations are captured in the presence of weight sharing, and the equivalent variances of CNN are collected.

3.1.5. Limited weight sharing of CNN
The performance of CNN is improved by adding a limited amount of weights during the pre-training process [38]. Similar feature mapping with weight sharing is performed among the neurons. During the weight sharing procedure, similar pooling layers are connected with the conventional layer [39]. The features are calculated using neurons, and the number of parameters increases rapidly. A better initial value is obtained during the weight training procedure, and the pooling layer can subsample the learned values.

3.1.6. Cybersecurity applications using CNN
Zhang et al. [40] presented a CNN-based model for network intrusion detection which identifies the malicious activities that take place on the internet. They use a new Class Imbalance Processing Technology (SGM-CNN) for large-scale datasets, which combines undersampling (Gaussian Mixture Model) and oversampling (Synthetic Minority) schemes. Intrusion detection datasets usually suffer from an unbalanced class problem, which leads to a low detection rate. The CIPT technique improves the classification accuracy of minority classes and provides a detection rate of 99.85%. Xiao et al. [41] proposed a malware classification framework to classify the different types of malware and their malicious intent. It first visually analyzes the malware, and then the classifier extracts the features for classification. The malware present in binary form is visualized using entropy graphs, and a Deep CNN (DCNN) is used for feature extraction. The accuracy of this approach is 0.997 for the Malimg dataset and 1 for the Microsoft dataset.

3.2. Autoencoder (AE)

One or more hidden layers with input and output layers are connected to the Autoencoder (AE). The AE has a similar number of inputs and outputs, and the data transmission is carried out through a smaller path. The transfer and unsupervised learning issues are solved using a neural network. Therefore, task discovery and analysis are performed with the usage of autoencoders based on their characteristics [42]. The encoder and

Table 1
Summary of cybersecurity attacks.
| Author name and year | Name of the attack | Intention of the attack | Demerits | Type of neural network/technique applied |
|---|---|---|---|---|
| Sihag and Tajer [29] | Causative attack | To alter the decision of the algorithm | Computational complexity, attack complexity | Decision-making algorithm |
| Guangyu et al. (2017) | Integrity attack | To insert false information to partial actuators | - | Cyber–physical system |
| Yaguan Qian et al. (2020) | Spot evasion attack | To generate malicious images resembling the original images | Serves as a big threat for the license plate identification system | Convolutional Neural Network |
| Jiang et al. [28] | Poison and evasion attack | To reduce the classification accuracy by injecting malware | These attacks significantly decrease the classification accuracy | Deep Neural Network, PSO |
| Mahloujifar et al. [27] | Poisoning attack | To increase the error probability | It does not apply to strong p-budget attacks | Probably approximately correct learning |
| Katzir and Elovici [30] | Adversarial attack | To compute the network's loss gradient | The complex tradeoff for future research and increased cost | Neural Network |
| Xu et al. [25] | Adversarial attack | Attacker targets the deep learning model to make mistakes | Safety-critical applications are widely affected by these attacks | Deep Neural Network |
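
As a rough illustration of the adversarial and evasion rows in Table 1, the sketch below perturbs the input of a toy logistic classifier in the direction of the sign of the loss gradient. The weights, the input, the epsilon value, and all function names are made-up assumptions for this example; it is not the method of any surveyed paper.

```python
import math


def predict_proba(weights, bias, x):
    """Toy logistic model: probability that x belongs to class 1."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def sign(value):
    """Return -1, 0, or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def gradient_sign_perturb(weights, bias, x, label, epsilon):
    """Shift every feature by epsilon in the direction that increases the loss."""
    p = predict_proba(weights, bias, x)
    # For the cross-entropy loss of a logistic model,
    # d(loss)/d(x_i) = (p - label) * w_i.
    grad = [(p - label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model and input, chosen only for illustration.
weights, bias = [2.0, -1.0], 0.0
x, label = [0.5, 0.2], 1

x_adv = gradient_sign_perturb(weights, bias, x, label, epsilon=0.6)
clean_score = predict_proba(weights, bias, x)
adv_score = predict_proba(weights, bias, x_adv)
print(clean_score > 0.5, adv_score > 0.5)
```

With these particular values the clean input is assigned to class 1 while the perturbed input falls below the 0.5 threshold, even though no feature moved by more than epsilon.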

Fig. 2. Classification of deep learning based on cybersecurity.

Fig. 3. Structure of convolutional neural network.
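
To make the convolution and pooling steps behind Fig. 3 concrete, the following pure-Python sketch slides a small kernel over a two-dimensional input to build a feature map and then downsamples it with max pooling. The image, the kernel, and the function names are illustrative assumptions, not taken from the surveyed work.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (stride 1).

    Like most deep learning libraries, this computes a cross-correlation:
    the kernel is not flipped before it is applied.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(feature_map, size):
    """Keep the largest value of each non-overlapping size x size window."""
    rows = range(0, len(feature_map) - size + 1, size)
    cols = range(0, len(feature_map[0]) - size + 1, size)
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in cols
        ]
        for r in rows
    ]


# Hypothetical 5 x 5 input and 2 x 2 kernel (the shared weights).
image = [
    [1, 2, 0, 1, 3],
    [0, 1, 2, 3, 1],
    [1, 0, 1, 2, 2],
    [2, 1, 0, 1, 0],
    [0, 3, 1, 0, 1],
]
kernel = [[1, 0], [0, -1]]

feature_map = conv2d_valid(image, kernel)  # 4 x 4 feature map
pooled = max_pool(feature_map, 2)          # 2 x 2 after pooling
print(pooled)  # [[0, 1], [0, 2]]
```

Each 2 x 2 window of the input seen by the kernel plays the role of the receptive field, and the same four weights are reused at every position.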

decoder are the fundamental components of the autoencoder. Hence, the inputs are received by the encoder and fed to the novel model (i.e. Latent or Code). The encoder distributes a code to the generator, and the reconstruction errors are reduced through the training procedure of the autoencoder. The basic architecture of the autoencoder is depicted in Fig. 4, where the minimum input and output differences are also shown. The stacked,

Fig. 4. Organization of Auto Encoder.
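
The encoder, code, and decoder arrangement of Fig. 4 can be sketched with a very small linear autoencoder trained by gradient descent. This is a minimal sketch under stated assumptions: the two-dimensional data, the one-dimensional code, the learning rate, and the function names are all hypothetical, and real autoencoders use nonlinear layers and a deep learning framework.

```python
def encode(w_enc, x):
    """Compress the input vector into a single code value."""
    return sum(w * xi for w, xi in zip(w_enc, x))


def decode(w_dec, code):
    """Expand the code back into a vector of the input's size."""
    return [w * code for w in w_dec]


def reconstruction_error(w_enc, w_dec, data):
    """Mean squared distance between each input and its reconstruction."""
    total = 0.0
    for x in data:
        x_hat = decode(w_dec, encode(w_enc, x))
        total += sum((xi - hi) ** 2 for xi, hi in zip(x, x_hat))
    return total / len(data)


def train_step(w_enc, w_dec, data, lr):
    """One gradient-descent step on the reconstruction error."""
    g_enc = [0.0] * len(w_enc)
    g_dec = [0.0] * len(w_dec)
    for x in data:
        code = encode(w_enc, x)
        x_hat = decode(w_dec, code)
        err = [hi - xi for hi, xi in zip(x_hat, x)]
        # Decoder gradient: 2 * err_i * code.
        for i in range(len(w_dec)):
            g_dec[i] += 2 * err[i] * code
        # Encoder gradient: error propagated back through the decoder.
        back = sum(e * w for e, w in zip(err, w_dec))
        for j in range(len(w_enc)):
            g_enc[j] += 2 * back * x[j]
    n = len(data)
    new_enc = [w - lr * g / n for w, g in zip(w_enc, g_enc)]
    new_dec = [w - lr * g / n for w, g in zip(w_dec, g_dec)]
    return new_enc, new_dec


# Hypothetical data lying on a line, so one code value can describe it.
data = [[t, 2 * t] for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]
w_enc, w_dec = [0.1, 0.1], [0.1, 0.1]

initial_error = reconstruction_error(w_enc, w_dec, data)
for _ in range(500):
    w_enc, w_dec = train_step(w_enc, w_dec, data, lr=0.05)
final_error = reconstruction_error(w_enc, w_dec, data)
print(final_error < initial_error)
```

The code value is the narrow path through which all information must pass, and training lowers the difference between the input and its reconstruction.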

denoising, and variational autoencoders are a few kinds of autoencoder.

3.2.1. Stacked Auto Encoder (SAE)
The stacked autoencoder is one of the unsupervised learning variants of autoencoders. The input, output, and hidden layers are the three basic layers of the stacked autoencoder [43]. The representation of the hidden layer is used to map and reconstruct the input data. In each turn, the training process is executed via a greedy layer selection method. The final feature from the last layer is used to perform the supervised learning procedure. Running in forward order executes the autoencoder process, and running in reverse order executes the decoder process. Thus, the higher-order feature delivers vector input. Ultimately, the softmax is used to solve the classification issues effectively.

3.2.2. Denoising Auto Encoder (DAE)
Similar input and output data are used for the autoencoder process [44]. Noise is added in the training procedure, which produces denoising power. The denoising autoencoder can train all convolutional kernels also the auto coder is trained. In

3.3.1. Deep Boltzmann Machine (DBM)
Stacked layers of restricted Boltzmann machines with graphical, unsupervised, generative, and probabilistic representation are used in DBM. The latent features of the data are detected with the usage of DBM, and it contains undirected connections [49]. The greedy layer is used to perform an effective learning and parametric inference procedure. The optimal parameters are detected using unsupervised representation.

3.3.2. Restricted Boltzmann Machine (RBM)
The most popular type of deep belief neural network is the restricted Boltzmann machine. The stochastic binary units and edges are the parts of a stochastic neural network [50]. Hence, scalability and impracticality issues arise during the Boltzmann machine learning process. The neurons of the RBM create a bipartite graph in which the visible and hidden layers are connected together [30]. Connections within the same layer are restricted, and the process is executed using an effective algorithm. Dimensionality reduction and feature learning are performed via RBM.

3.3.3. Deep Restricted Boltzmann Machine (DRBM)
The improved power with deep belief and Boltzmann net-
classification, the optimal feature and denoise input are obtained. work are the combinations of deep restricted Boltzmann ma-
chine [51]. The higher quality features are extracted using DRBM.
3.2.3. Variational Auto Encoder (VAE) Each layer is strictly restricted by DRBM. The combination of the
deep and restricted network can organize each layer [52]. During
The important generative representation of variational en-
the training process, each data with gradient and free energy
coder was introduced in the year of 2013. The backpropagation
computations are carried out.
has been lead to semi-supervised learning and quick training pro-
cess [45]. It is suitable for the application of IoT device security,
3.4. Recurrent Neural Network (RNN)
sensor failure detection, and the security of the intrusion systems
respectively [46]. The probabilistic model of learning is perfectly The one or more feedback connection is associated with the
implemented using a variational autoencoder. The visualization, recurrent neural network and it functions as a loop activation.
recognition, representation, and denoising task are executed by The sequence learning and temporal procedures are performed
the variational autoencoder. via the enabled network [53]. The additional loop with multi-
layer Perceptron is a basic of RNN and it consists of smaller
3.3. Deep Belief Network (DBN) memory. The activation of stochastic function with neuron is
connected potentially [54]. The same gradient descent function is
used to execute learning, activation, and architectural functions.
The stochastic and hidden layer is a basic component of
The recurrent features are collected through annealing concept.
the deep belief networks. The stochastic variables with a direct The recurrent neural network structure is depicted in Fig. 6.
acyclic graph are used to implement DBN [47]. The generative
and discriminative DBN are worked under the principle of greedy 3.4.1. Bidirectional RNN (BRNN)
selection. The unobserved variables and learning problems are The ordinary RNN limitations are improved using bidirectional
important issues in DBN [48]. During posterior distribution with RNN [55]. At a specified time with past and future input is used
the training, the procedure is complex. Hence, the states of 0 and to train the BRNN. The structure is divided into forwarding and
1 are used as binary units for the stochastic model. The structure backward states. The backward state does not have the input
of the deep belief network is shown in Fig. 5 and its classifications connection of forwarding state output [56]. A similar network
are explained in the below section. takes time and the objective functions are minimized effectively.

Fig. 5. Structure of deep belief network.
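To make the stochastic binary units of Sections 3.3 and 3.3.2 concrete, the following minimal sketch samples the hidden layer of a tiny restricted Boltzmann machine given a visible vector. The weights, biases, and layer sizes are illustrative assumptions and do not come from any surveyed paper. Each hidden unit takes state 1 with probability sigmoid(b_j + sum_i v_i w_ij), and there are no hidden-to-hidden connections.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probabilities(visible, weights, hidden_bias):
    """Return P(h_j = 1 | v) for each hidden unit of a restricted Boltzmann machine."""
    probs = []
    for j, bias in enumerate(hidden_bias):
        # Only visible-to-hidden weights contribute: the graph is bipartite.
        activation = bias + sum(v * weights[i][j] for i, v in enumerate(visible))
        probs.append(sigmoid(activation))
    return probs


def sample_binary(probs, rng):
    """Draw each unit's state (0 or 1) independently from its probability."""
    return [1 if rng.random() < p else 0 for p in probs]


# Hypothetical toy model: 3 visible units, 2 hidden units.
weights = [[0.5, -0.3], [0.1, 0.8], [-0.6, 0.2]]
hidden_bias = [0.0, -0.1]
visible = [1, 0, 1]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
probs = hidden_probabilities(visible, weights, hidden_bias)
hidden_state = sample_binary(probs, rng)
```

Because units within a layer are conditionally independent, the whole hidden layer can be sampled in one pass, which is the property that makes RBM training tractable compared with an unrestricted Boltzmann machine.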

Fig. 6. Structure of recurrent neural network.
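The split into forward and backward states described for the bidirectional RNN (Section 3.4.1) can be sketched with scalar states. This is a minimal illustration under assumed toy weights (`w_in`, `w_rec`) and a tanh activation, none of which come from the cited works. One pass reads the sequence left to right, a second independent pass reads it right to left, and the two states are paired per time step.

```python
import math


def rnn_pass(inputs, w_in, w_rec):
    """Run a scalar recurrent layer over inputs and return the state at each step."""
    state = 0.0
    states = []
    for x in inputs:
        state = math.tanh(w_in * x + w_rec * state)
        states.append(state)
    return states


def bidirectional_states(inputs, w_in=0.7, w_rec=0.5):
    """Pair a forward-in-time state with a backward-in-time state for every step."""
    forward = rnn_pass(inputs, w_in, w_rec)
    # The backward pass sees the reversed sequence and never reads the forward states.
    backward = rnn_pass(inputs[::-1], w_in, w_rec)[::-1]
    return list(zip(forward, backward))


sequence = [1.0, 0.0, -1.0, 0.5]
paired = bidirectional_states(sequence)
```

At each time step the first element summarizes the past and the second summarizes the future, so a classifier placed on top can use context from both directions.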

Future information does not require any delays. The complications of the forward and backward connections are handled through backpropagation.

3.4.2. Long Short Term Memory (LSTM)
Long short-term memory is an elaboration of the RNN. Inputs of 0 and 1 are used for every computing gate unit [57]. The feedback information is saved with the help of the LSTM, and every LSTM consists of several neurons. Here, write, read, and forget are the memory gates [39]. The memory cells are controlled using these gates. The training stage with backpropagation is affected by the vanishing gradient issue.

3.4.3. Acoustic model of RNN (ARNN)
The acoustic model consists of lengthy vector data with temporal variability. Thus, long-term dependencies in the data are not captured by CNN [58]. Vanishing gradients easily affect these networks. Exponential error decay occurs because of backpropagation. An unacceptable arrangement of weights is easily adopted. Dimensionality reduction and feature learning are performed quickly.

3.4.4. Gated recurrent unit
Long-term dependencies during the RNN training process are captured using the gated recurrent unit (GRU). The GRU is produced by modifying the state-to-state transition of the hidden layer and making a few changes to the LSTM [59]. Its simplicity is the major reason for its popularity. Few parameters are required in the GRU, and the network operations are performed easily.

3.5. Generative Adversarial Network (GAN)
The generator and the discriminator are the major models of a GAN. The tasks are determined by means of the discriminator model, and the accurate output is produced using the discriminator [37,60]. Real and artificial inputs and outputs are correctly recognized using the GAN. High-quality synthetic data are created [61]. The new data are obtained from the former network. Real and fake data in a latent network are distinguished by the discrimination process. Hence, minimax theory provides the major objective function of the GAN [62]. The discriminator and generator operations of the GAN are shown in Fig. 7.

3.6. Deep Reinforcement Learning (RIL)
In 2015, DeepMind's deep reinforcement learning was introduced by Mnih et al. [22]. Maximizing the cumulative reward is the basic idea of this machine learning approach. The size of high-dimensional input data is reduced by means of deep neural networks [63]. The values of the Q-function are implemented using a multi-layer perceptron. The deep reinforcement learning procedure is described in Fig. 8.

3.6.1. Multi-task reinforcement (MTR)
The power of reinforcement learning comes at the cost of learning time and a large number of trajectories [48]. Performance is notably improved using the multi-task reinforcement learning method [64].

3.6.2. Multi-agent reinforcement (MAR)
Multi-agent reinforcement operates through the amalgamation of optimal rewards. Hence, arrays of multiple tasks are implemented quickly using MAR [65].
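The cumulative reward that reinforcement learning maximizes (Section 3.6) is commonly computed as a discounted sum of per-step rewards. The following is a minimal sketch of that computation. The discount factor `gamma` and the reward sequence are illustrative assumptions, not values from the surveyed papers.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum the rewards, weighting the reward at step t by gamma**t."""
    total = 0.0
    # Work backwards so each step adds its reward to the discounted future value.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical episode: the agent is rewarded only at the final step.
episode_rewards = [0.0, 0.0, 1.0]
value = discounted_return(episode_rewards)
```

A smaller `gamma` makes the agent favor immediate rewards, while a value close to 1 makes delayed rewards count almost as much as immediate ones.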

Fig. 7. Structure of generative adversarial network.
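For reference, the minimax objective that Section 3.5 names as the GAN objective function is usually written in the form below, following the standard GAN formulation. The symbols are the conventional ones and are not defined in this survey: $G$ is the generator, $D$ the discriminator, $p_{\text{data}}$ the real data distribution, and $p_z$ the noise prior.

```latex
\min_{G}\,\max_{D}\; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

The discriminator is trained to increase $V$ by scoring real samples high and generated samples low, while the generator is trained to decrease it by producing samples the discriminator accepts as real.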

Fig. 8. Organization of deep reinforcement learning network.
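The temporal-difference update at the heart of Q-learning (Section 3.6.4) can be illustrated with a small table in place of a neural network. This is a sketch under assumed values: the learning rate `alpha`, the discount `gamma`, and the state and action names are hypothetical, and a deep variant would replace the table with a network that approximates the Q-function.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Apply one temporal-difference update to the Q-table and return the TD error."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return td_error


actions = ["allow", "block"]
q_table = defaultdict(float)  # unseen (state, action) pairs start at 0.0

# Hypothetical transition: blocking suspicious traffic earns a reward of 1.0.
err1 = q_update(q_table, "suspicious", "block", 1.0, "normal", actions)
err2 = q_update(q_table, "suspicious", "block", 1.0, "normal", actions)
```

The update needs only the observed transition, which matches the point that Q-learning does not require complete knowledge of the environment. The shrinking TD error across the two calls shows the estimate converging toward the observed reward.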

3.6.3. Asynchronous reinforcement (AR)
The long training time of deep reinforcement learning is an important problem [66]. An asynchronous architecture is therefore used to enhance the performance of deep reinforcement learning, and its major elements are the central parameter servers and the learning process [67].

3.6.4. Q-learning Reinforcement (QR)
Temporal difference (TD) learning is the chief learning method of the Q-learning process, and complete knowledge of the environment is never needed [24]. In every state, the optimal action is determined from the states already traversed [68]. The Q factor is updated incrementally by the neural network, and the memory holds the previous-state values.

4. Analysis and discussion

This survey discusses cybersecurity systems that use deep learning methods to detect different kinds of attack. Cybersecurity issues occur everywhere, for example in mail, computer systems, vehicles, entertainment, banks, companies, financial institutions, and online data storage. In this survey, we chose the concept of deep learning-based cybersecurity attack detection. Approximately 80 papers related to the survey topic were selected. The effectiveness of this survey is analyzed and compared using various parameters. In this section, the different kinds of algorithms, methods, platforms, advantages, and disadvantages of the deep learning-based cybersecurity attack detection papers are discussed. The general acronyms of this survey are listed in Table 2.

4.1. Performance analysis of cybersecurity attack detection papers

Kravchik and Shabtai [69] used a one-dimensional CNN to analyze the different types of cyberattacks on Industrial Control Systems. Out of 36 cyberattacks present in the dataset, their proposed method successfully identified 31 using convolutional and recurrent neural networks. They used the Secure Water Treatment (SWaT) dataset and implemented the method in the Google TensorFlow framework. However, this technique suffers from low interpretability and fails to identify the other five cyberattacks. Mimura and Tanaka [70] created a generic detection approach that does not depend on any particular attack method or feature vector. The main intention of this technique is to identify the adversaries’ communication from proxy server logs. It identifies two attacks, namely command-and-control traffic and unfamiliar drive-by download attacks, with F-measure values of 0.98 and 0.99. Vinayakumar et al. [71] categorized cyberattacks into two levels, namely network level and host level. They focused on dynamic malicious attacks and the large volumes of data used when designing an Intrusion Detection System (IDS). However, this technique suffered from a high computational cost.
Vasan et al. [72] presented a CNN approach for image-based malware classification. This approach can identify different types

of malware and classify them by family. The data variance challenge is overcome here using an augmentation technique. This approach suffers from a slightly higher run-time overhead. Li et al. [61] presented an anti-steganalysis approach using CNN for detecting malware in images. They focused on the Least Significant Bit based evolutionary algorithm attack and the gradient-based attack. It shows significant ways to compromise steganalysis and the weakness of the neural network. The summary of the cybersecurity attacks that take place in CNN networks is provided in Table 3.

The summary of the cybersecurity attacks that take place in AE-powered neural networks is shown in Table 4. Meira et al. [80] used an unsupervised learning model to detect unfamiliar attacks with an anomaly-based IDS. This method performs better and is appropriate for IDS, but its false-positive rate is high. Experimentally, the NSL-KDD dataset is used and Python is used for implementation. Thing [81] focused on the security of IEEE 802.11 and identified novel threats and attacks conducted on it. To identify and classify the anomalies, they used an unsupervised deep learning approach. The dataset used is AWID-CLS, and Table 4 lists Matlab as the implementation platform. Even though the classification performance is accurate, it has a large computational cost. Lopez-Martin et al. [82] proposed an IDS to identify the malicious labels present inside the decoder layer using a conditional variational autoencoder. It can create effective feature reconstruction and provides higher classification results. However, lower classification performance is also reported for this method (Table 4). It is implemented on the OMNET platform using the STL-10 dataset. Diro and Chilamkurti [21] identified the cyberattacks in the IoT environment and mainly focused on zero-day attacks. They analyzed the hidden patterns present in the training data to separate malicious traffic from normal traffic. The experiments are performed in Java, and the detection process is centralized but not suited to analyzing a huge number of parameters.

In [83], the supervised technique improves classification outcomes but is not appropriate for optimizing other stages. The NSL-KDD dataset is used, with implementation on the Python platform. In [84], the author used an unsupervised learning process. This method is more scalable and flexible but not suitable for conducting a massive experiment. The performance of the proposed concept is analyzed using gearbox datasets and implemented in the C++ programming language.

The cybersecurity attack papers focused on the deep belief network are discussed in Table 5. Skopik et al. [86] found that, to protect future networks from attacks, information sharing between two organizations should be secured. Even though it offers secure information sharing, it is costly and deployable only in critical infrastructures. Zhang et al. [90] presented a network IDS powered by a DBN and a Support Vector Machine. Balakrishnan et al. [103] developed an intelligent IDS using a DBN to counter critical cyberattacks in the IoT environment. The main intent of this model is to identify the adversary’s activity inside the network once they cross the border. This model can identify data injection attacks precisely when trained with the MNIST dataset. However, this model suffered from high computational cost and low accuracy. Thamilarasu and Chawla [88] also proposed an intelligent IDS for IoT systems using a deep learning algorithm to identify malicious traffic. This method is more effective and feasible with a larger bandwidth. The dataset used is NSL-KDD, and it is implemented in Matlab. The author used the NSL-KDD, MNIST, and Kyoto datasets for the unsupervised deep learning process and predicted denial-of-service and overflow attacks efficiently. Even though this method offers high accuracy, a complex hardware implementation is required.

The recurrent neural network papers about cybersecurity attacks are represented in Table 6. Nabil et al. [91] used a deep feed-forward neural network and an RNN to identify consumers who report false electricity usage. This mainly happens in the Advanced Metering Infrastructure (AMI) and is also known as an electricity theft cyberattack. This approach is implemented in real-time detectors using the Python platform to detect contamination attacks. Venkatraman and Vinayakumar [96] presented a hybrid deep learning architecture to detect malware in images. The main intent is to identify suspicious behavior with different hybrid architectures. The Malimg, VirusSign, and VirusShare datasets are utilized for malware detection, and the procedure is implemented in both Java and Python. The outputs are often subject to misclassification in this approach. In [92], the supervised model has parallel computation but low translation and encoding performance. A C++ program is used with the COCO dataset for training. In [93], the unsupervised learning model with the SST dataset is implemented in Java. This method has a better forward procedure but is complex for huge data analysis. Salehinejad et al. [95] suggested a supervised model using the BABL dataset and the OMNET platform. It is simple and flexible but needs more time during weight transfer.

The cybersecurity attacks based on deep reinforcement learning are presented in Table 7. In [22], an unsupervised learning process is presented with the CIFAR-10 dataset, implemented in Java. This method is suitable for asynchronous demon attack detection but has a lower training speed. Allen and Liu [99] proposed Monte Carlo Bayesian reinforcement learning to reduce the median estimated learning time and provide faster learning. The cost of the system implementation is particularly high per host and also while identifying the current threats in the system.

Table 2
List of Acronyms in cybersecurity with Deep learning concept.
Abbreviations | Detailed Form
AI | Artificial Intelligence
DL | Deep Learning
IoT | Internet of Things
CNN | Convolutional Neural Network
AE | Autoencoder
DBN | Deep Belief Network
GAN | Generative Adversarial Network
RNN | Recurrent Neural Network
LSTM | Long Short Term Memory
SCNN | Singular Convolutional Neural Network
MCNN | Multi-convolutional Neural Network
VCC | Variants of Convolutional Neural Network
ACNN | Acoustic model of Convolutional Neural Network
LWCNN | Limited Weight Sharing of Convolutional Neural Network
SAE | Stacked Auto Encoder
DAE | Denoising Auto Encoder
VAE | Variational Auto Encoder
DBM | Deep Boltzmann Machine
RBM | Restricted Boltzmann Machine
DRBM | Deep Restricted Boltzmann Machine
GRU | Gated Recurrent Unit
ARNN | Acoustic Model of RNN
BRNN | Bidirectional Recurrent Neural Network
ML | Machine Learning
TD | Temporal Difference
RIL | Deep Reinforcement Learning
QR | Q-learning Reinforcement
MAR | Multi-Agent Reinforcement
MTR | Multi-Task Reinforcement
AR | Asynchronous Reinforcement
UK | United Kingdom
DoS | Denial of Service
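Several detectors reviewed in Section 4.1 are compared by F-measure and false-positive rate. The sketch below shows how those figures follow from a confusion matrix. The counts are invented for illustration and do not come from any surveyed paper.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute common intrusion-detection metrics from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # also the true-positive rate
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return {
        "precision": precision,
        "recall": recall,
        "false_positive_rate": false_positive_rate,
        "f_measure": f_measure,
    }


# Hypothetical counts: 90 attacks caught, 10 missed, 5 false alarms, 895 benign flows passed.
metrics = detection_metrics(tp=90, fp=5, tn=895, fn=10)
```

Reporting the false-positive rate alongside the F-measure matters for intrusion detection because benign traffic usually dominates, so even a small rate can produce many false alarms.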

Table 3
Convolutional neural Network for cybersecurity attack detection.
References | Learning Model | Platform | Attacks | Disadvantage | Dataset
[69] | Unsupervised | Google Tensorflow framework | Zero-day attack | Interpretability of attack detection is low | SWaT
[73] | Unsupervised | Apache spark cluster | Denial of service and distributed denial of service | High computational cost | KDDCup 99 [74]
[70] | Supervised | Python 2.7 | – | Low-performance accuracy | MWS [75]
Tariyal et al. (2016) | Unsupervised | Java | Phishing attack | Data missing occurred due to noise | NIDS [76]
[40] | Unsupervised | Python | Web attacks (Brute Force, SQL injection, and XSS), Infiltration, Heartbleed, Backdoor, and worms | The multiple convolutional layers present increase the computational time and cost | UNSW-NB15 [77], CICIDS2017 [78]
[41] | Unsupervised | Keras with TensorFlow | Malware attacks (Worm, Trojan, Backdoor, etc.) | Fixed Segment length, Overfitting problem, and Performance Degradation | Malimg (https://www.kaggle.com/afagarap/malimg-dataset) and Microsoft from Kaggle
[72] | Unsupervised | Python | Cyberattack (trojan, backdoor, worm) | Runtime overhead | Malimg, and IoT android mobile dataset
Shiyu Li et al. (2020) | Unsupervised | – | Evolutionary algorithm attacks, and Gradient-based attacks | Perturbations during backpropagation | BOSSbase [79]

Table 4
Auto Encoder Network for cybersecurity attack detection.
References | Learning Model | Platform | Attacks | Disadvantage | Dataset
[80] | Unsupervised | Python | Unfamiliar attacks | The high false-positive rate | NSL-KDD [74]
[81] | Supervised | Matlab | Novel threats and attacks | High cost | AWID-CLS [85]
[82] | Supervised | OMNET | Denial of service | The lower performance of classification | STL-10 (https://ai.stanford.edu/~acoates/stl10/)
[21] | Unsupervised | Java | Zero-day attack | Not suitable to huge parameter | NSL [74]
[83] | Supervised | Python | – | Not suitable to optimize many layers | NSL-KDD [74]
[84] | Unsupervised | C++ | – | Not suitable for a huge experiment | Gearbox

Table 5
Deep Belief Network for cybersecurity attack detection.
References | Learning Model | Platform | Attacks | Disadvantage | Dataset
[86] | Unsupervised | Web | Covert Cyberattacks | Critical infrastructure | ENISA (http://data.europa.eu/88u/dataset/enisa-threat-taxonomy-1)
[87] | Unsupervised | Matlab | Sinkhole attack | Larger width | NSL-KDD [74]
[88] | Supervised | Raspberry pi | Denial of service and overflow attack | The high cost of computation | Traffic dataset [88]
[89] | Unsupervised | C++ | User root attack, remote to local attack and probe attack | High hardware requirement | NSL-KDD [74], MNIST, and Kyoto [76]
[90] | Unsupervised | – | Network intrusion attack | Needs improvement in accuracy | CICIDS2017 [78]

The dataset is composed of collected malicious emails, and the system is implemented on a Linux platform. Nguyen and Janapa Reddi [94] reviewed the widely used Deep Reinforcement Learning (DRL) models for identifying cyberattacks in the system. The DRL technique is widely used to solve dynamic, intricate, and multidimensional security problems with a limited amount of communication. Ferdowsi et al. [100] presented a DRL algorithm for autonomous vehicles to safeguard them from cyber–physical attacks. A game theory model is created between the adversary and the autonomous vehicle. Here the adversary injects malicious data into the autonomous vehicle’s sensors to alter the optimal spacing and cause accidents. The system guards against such an attack by minimizing the spacing deviation caused by the adversary. Yu et al. [101] used DRL to combat real-time attacks such as illegal woodcutting,

Table 6
Recurrent Neural Network for cybersecurity attack detection.
References | Learning Model | Platform | Attacks | Disadvantage | Dataset
[91] | Semi-supervised | Python | Contamination attack | Not suitable to hyper parametric analysis | ENISA (http://data.europa.eu/88u/dataset/enisa-threat-taxonomy-1)
[92] | Supervised | C++ | – | Lower translation and encoding | COCO [92]
[93] | Unsupervised | JAVA | – | Complex for huge data analysis | SST [94]
[95] | Supervised | OMNET | – | A larger time occurred during weight transfer | BABL [95]
[96] | Supervised | Apache Spark | Malware attack | Low performance | Malimg (https://www.kaggle.com/afagarap/malimg-dataset)
[71] | Supervised | Python | Malware attack | High cost | Ember, MalConv [73]
[97] | Unsupervised | Python | Zero-day attack | Damages occurred due to misclassification | Malimg (https://www.kaggle.com/afagarap/malimg-dataset)

Table 7
Deep Reinforcement Learning Network for cybersecurity attack detection.
References | Learning Model | Platform | Attacks | Disadvantage | Dataset
[22] | Unsupervised | JAVA | Demon attack | Low training speed | CIFAR-10 [98]
[99] | Semi-supervised | LINUX | Denial attack | High cost | Email [99]
[94] | Unsupervised | OMNET | Adversarial attack | Only a limited amount of communication attacks are detected | SST [94]
[100] | Unsupervised | Apache Spark | – | Temporal features only extracted | FLIR [100]
[101] | Semi-supervised | Double Oracle Framework | Yolo attack | A limited number of iteration | SSE [101]

Table 8
Generative Adversarial Network for cybersecurity attack detection.
References | Learning Model | Platform | Attacks | Disadvantage | Dataset
[23] | Unsupervised | MATLAB | Denial attack | Noise during overlap | CIFAR-10 [98]
[30] | Supervised | C++ | Partitioned attack | Not suitable for various tasks | Malwr (https://malwr.com)
[102] | Semi-supervised | LINUX | – | Computational complexity | NSL-KDD [74]
[102] | Supervised | Patching | – | Complex security detection | SSE [101]
[102] | Supervised | JAVA | Zero-day attack | Does not consider information based on network payload | ISCX Botnet [102]

poaching, and overfishing. Using a deep Q-network, the attacker is identified from the real-time information obtained. Real-time information is derived from the footprints of the illegal member.

Furthermore, the generative adversarial network papers for cybersecurity attack detection are tabulated in Table 8. Radford et al. [23] presented a deep CNN-GAN pair to identify sets of malicious images. The unsupervised learning models have a hierarchical representation, with noise occurring during a denial attack. The CIFAR-10 dataset is used, with implementation on the MATLAB platform. It has a hierarchical representation with noise that occurs during overlap. In [30], a supervised learning model is used for malware identification through dynamic analysis with an up-to-date dataset known as Malwr. It provides empirical and classified evidence, but it is not suitable for different kinds of tasks. The partitioned attack is detected effectively using this approach. Lin et al. [102] presented an IDS-based GAN that can create adversarial attacks to compromise the IDS. The attack conducted is a type of black-box attack. The dataset used is NSL-KDD, and it is implemented on the Linux platform. This method is highly robust against various kinds of attacks, but it has high computational complexity. Chhetri et al. [104] presented a GAN security model to overcome the cross-domain attacks that take place in the cyber–physical system environment. Optimum availability and integrity performance are accomplished with complex security identification. The botnet attack is one of the large-scale attacks that damage cybersecurity. Yin et al. [105] presented a botnet-based GAN (bot-GAN) to overcome malicious attacks and detect new botnets. The mechanism is scalable and enhances botnet detection, but it never considers the payload information. The ISCX Botnet dataset is used, and the framework is implemented in Java.

4.2. Comparative analysis

In this survey, approximately 101 papers are collected and 80 papers were taken for technical analysis. Each of the papers is

referred to as the category of deep learning. The papers are selected from different kinds of sources such as Elsevier, Springer, IEEE, Sage, conferences, and others. The pie chart of papers selected from the different journals is shown in Fig. 9. Of these, 30% of papers are selected from Elsevier, 22% from Springer, 25% from IEEE, and 8% from the Sage journals. The highest number of papers are chosen from Elsevier, and the papers are collected from the cybersecurity and deep learning domains.

Fig. 9. Representation of cybersecurity attack detection papers chosen from different journals.

The cybersecurity attack detection papers that focus on deep learning are selected from the years 2014 to 2020. In this survey, we used approximately 80 papers for technical review. Eight papers are chosen from 2014, nine from 2015, ten from 2016, twelve from 2017, twenty-four from 2018, and the remaining eighteen from 2019. The highest number of papers are selected from 2018. Eight papers are taken from 2020. The number of papers chosen from every year is shown in Fig. 10.

Fig. 10. Analysis of cybersecurity attack detection papers published every year.

In this paper, cybersecurity attack detection based on deep learning methods is categorized into six classes: Convolutional Neural Network (CNN), Auto Encoder (AE), Deep Belief Network (DBN), Recurrent Neural Network (RNN), Generative Adversarial Network (GAN), and Deep Reinforcement Learning (RIL). We have selected 10% of papers from RNN, 7% from AE learning, 8.5% based on DBN learning, 8.2% from CNN learning, 3% based on GAN learning, and 4% based on RIL learning, and the remaining papers are based on the concept of common deep learning. Several papers belong to the recurrent neural network domain. The number of papers published for each deep learning type is shown in Fig. 11.

Fig. 11. Analysis of cybersecurity attack detection papers based on deep learning methods.

From this survey, the deep learning-based cybersecurity attack detection papers are published from various countries such as Italy, Turkey, India, USA, UK, Japan, Jordan, China, Nigeria, Israel, Switzerland, Pakistan, Taiwan, Canada, Russia, Australia, and others. Each of the countries has published papers related to deep learning concepts, and among them 30% of papers belong to the United Kingdom (UK). The next 18% of papers are published by China. The lowest share of papers, 1%, is published by countries like Turkey, the USA, Jordan, Israel, Canada, and Russia. The analysis of the papers published by each country is shown in Fig. 12.

In this survey, we analyzed different kinds of attacks on cybersecurity systems using deep learning methods. The various kinds of attacks, such as denial of service, probe, malware, zero-day, phishing, sinkhole, user root, and others, are discussed in the above sections. The number of papers chosen for every attack is shown in Fig. 13. Here, we chose 12 papers for denial attack, 3 for probe, 5 for malware, 11 for zero-day, 5 for phishing, 2 for sinkhole, 5 for user root, and the remaining papers for other attacks. Several papers are chosen for denial of service attacks.

5. Open issues and future research directions

All the surveyed papers produced effective deep learning methods for security attack detection. Performance results such as accuracy, precision, recall, sensitivity, specificity, and acuteness are good and highly accepted, but complications remain depending on the method, platform, algorithms, and so on. A number of papers have been introduced to solve these issues, and a few of the open challenges are listed as follows:

• The inputs are regularly managed by the deep learning algorithm. So the deep learning parameter, topology, and layer identification are complex.
• The researchers should also focus on problems like how an attacker uses deep learning techniques to enter a victim’s system that is already secured with deep learning techniques.

Fig. 12. Analysis of cybersecurity attack detection papers published by each country.

• To detect and classify an intrusion in the network one


should always consider the type of attacks held and the
classes to which it belongs. The deep learning methodology
designed to solve cybersecurity problems should not be
focused on a single problem(malware detection) alone. The
deep learning model should be combined with multiple
machine learning methodologies and encryption algorithms
to identify a large area of attack vectors. In the future, mul-
tiple deep learning models can be integrated into parallel to
improve performance.
• The performance is often deterred by the normal data to
malware ratio used in the training dataset. The performance
metrics of cybersecurity applications such as speed, suspen-
sion of data poisoning, storage consumption, True Positive Fig. 13. Analysis of paper selected for each cybersecurity attack.
Rate, and False Positive Rate should be analyzed to evaluate
the efficiency of the cybersecurity application created.
• Obtaining a real malware dataset in real time is very hard. The malware datasets available are mainly created by experimentation or by reverse engineering of the virus. Future research can therefore focus on experimenting with different open-source datasets and benchmark models.
• Deep learning techniques for cybersecurity carry a higher cost complexity during error solving. Because deep learning techniques are similar to black boxes, the main cause of an error is hard to identify. In the future, the underlying causes of the attacks should be analyzed in detail to design an active learning approach for cybersecurity applications.
• High performance, GPUs, larger storage, a low false-positive rate, and accurate information are the basic resource requirements.
• Analytics over larger resources require highly scalable, low-power-consumption, flexible, and local bandwidth algorithms.
• Most of the designs have a high computational cost, are intractable, and have a complicated hyperparameter structure.

6. Conclusion

The rapid development of cybersecurity attack detection based on deep learning algorithms is summarized in this paper. The applications of deep learning in cybersecurity attacks are discussed. In this survey, nearly 80 papers are selected from the years 2014 to 2019. Here, we introduced several architectures of deep learning methods and their applications. The surveyed papers were collected from different kinds of sources such as Elsevier, IEEE, Springer, Sage, conference papers, and others. Approximately 30% of papers were obtained from Elsevier journals and 10 of them were conference papers. Approximately 24 papers were selected from the year 2018. Ultimately, the United Kingdom published several papers. In the future, we plan to introduce an effective algorithm to solve the open issues and challenges and to design a robust cybersecurity application.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
13
P. Dixit and S. Silakari Computer Science Review 39 (2021) 100317

[6] V. Sundararaj, An efficient threshold prediction scheme for wavelet based [34] Yaniv Taigman, Ming Yang, Marc Aurelio Ranzato, Lior Wolf, Deepface:
ECG signal noise reduction using variable step size firefly algorithm, Int. Closing the gap to human-level performance in face verification, in: Pro-
J. Intell. Eng. Syst. 9 (3) (2016) 117–126. ceedings of IEEE Conference on Computer Vision and Pattern Recognition,
[7] V. Sundararaj, Optimised denoising scheme via opposition-based self- 2014, pp. 1701–1708.
adaptive learning PSO algorithm for wavelet-based ECG signal noise [35] Xiong Zhao, Cheng Cheng, Zhou Li, Karlekar Xu, Shen Pranata, Xing, 3D-
reduction, Int. J. Biomed. Eng. Technol. 31 (4) (2019) 325. Aided Deep Pose-Invariant Face Recognition, in: IJCAI, Vol. 2, No. 3, 2018,
[8] V. Sundararaj, V. Anoop, P. Dixit, A. Arjaria, U. Chourasia, P. Bhambri, p. 11.
MR. Rejeesh, R. Sundararaj, CCGPA-MPPT: Cauchy preferential crossover- [36] Junchi Zhang, Yue Zhang, DonghongJi, Mengchi Liu, Multi-task and multi-
based global pollination algorithm for MPPT in photovoltaic system, Prog. view training for end-to-end relation extraction, Neurocomputing 4
Photovolt. Res. Appl. (2020). (2019a).
[9] S. Russell, P. Norvig, Artificial intelligence: a modern approach, 2002. [37] Yu, Li, Recent progresses in deep learning based acoustic models,
[10] Wells, Lee Jaime, Camelio, Christopher Williams, Jules White, Cyber- IEEE/CAA J. Autom. Sin. 44 (3) (2017) 396–409.
physical security challenges in manufacturing systems, Manuf. Lett. 2 (2) [38] Ossama Abdel-Hamid, Li Deng, Dong Yu, Exploring convolutional neural
(2014) 74–77. network structures and optimization techniques for speech recognition,
[11] X.A. Larriva-Novo, M. Vega-Barbas, V.A. Villagrá, M.S. Rodrigo, Evaluation Interspeech 11 (2014) 73–75.
of cybersecurity data set characteristics for their applicability to neural [39] MdZahangir Alom, TarekTaha, Chris Yakopcic, Stefan Weisberg, Pahed-
networks algorithms detecting cybersecurity anomalies, IEEE Access 8 ingSidike, Most ShamimaNasrin, MahmudulHasan, Brian C. Van Essen,
(2020) 9005–9014. Abdul A.S. Awwal, Vijayan K. Asari, A state-of-the-art survey on deep
[12] Hsien-De Huang, TonTon, Hung-Yu Kao, R2-D2: color-inspired convolu- learning theory and architectures, Electronics 8 (3) (2019) 292.
tional neural network (CNN)-based android malware detections, in: IEEE [40] Hongpo Zhang, Lulu Huang, Chase Q. Wu, Zhanbo Li, An effective con-
International Conference on Big Data, Big Data, 2018, pp. 2633–2642. volutional neural network based on SMOTE and Gaussian mixture model
[13] S. Mahdavifar, A.A. Ghorbani, Application of deep learning to cybersecu- for intrusion detection in imbalanced dataset, Comput. Netw. (2020a).
rity: A survey, Neurocomputing 347 (2019) 149–176. [41] G. Xiao, J. Li, Y. Chen, K. Li, Malfcs: An effective malware classifi-
[14] D.S. Berman, A.L. Buczak, J.S. Chavis, C.L. Corbett, A survey of deep cation framework with automated feature extraction based on deep
learning methods for cyber security, Information 10 (4) (2019) 122. convolutional neural networks, J. Parallel Distrib. Comput. (2020).
[15] S. KP, M. Alazab, A comprehensive tutorial and survey of applications of [42] Baldi, Pierre, Auto encoders, unsupervised learning, and deep architec-
deep learning for cyber security, 2020. tures, in: Proceedings of ICML workshop on unsupervised and transfer
[16] Komal Jaswal, TanupriyaChoudhury, RoshanLalChhokar, SoorajRandhir learning, 2014, pp. 37–49.
Singh, Securing the Internet of Things: A proposed framework, in: IEEE: [43] Jonathan Masci, Ueli Meier, Dan Ciresan, Schmidhuber, Stacked convolu-
International Conference on Computing, Communication and Automation, tional auto-encoders for hierarchical feature extraction, in: International
ICCCA, 2017, pp. 1277–1281. Conference on Artificial Neural Networks, 2014, pp. 52–59.
[17] Deng Li, Wang Gupta, Choi, A novel CNN based security guaranteed image [44] Yoshua Bengio, Li Yao, Guillaume Alain, Pascal Vincent, Generalized
watermarking generation scenario for smart city applications, Inform. Sci. denoising auto-encoders as generative models, in: Advances in Neural
479 (2019b) 432–447. Information Processing Systems, 2014, pp. 899–907.
[18] Kavukcuoglu Mnih, Rusu Silver, Bellemare Veness, Riedmiller Graves, [45] Xing Fang, MaochaoXu, ShouhuaiXu, Peng Zhao, A deep learning frame-
Ostrovski Fidjeland, Petersen, Human-level control through deep work for predicting cyber attacks rates, EURASIP J. Inf. Secur. 1 (5)
reinforcement learning, Nature 518 (7540) (2019) 529. (2019).
[19] G. Parekh, D. DeLatte, G.L. Herman, L. Oliva, D. Phatak, T. Scheponik, A.T. [46] Yunchen Pu, ZheGan, Ricardo Henao, Xin Yuan, Chunyuan Li, Andrew
Sharman, Identifying core concepts of cybersecurity: Results of two delphi Stevens, Lawrence Carin, Variational auto encoder for deep learning
processes, IEEE Trans. Educ. 61 (1) (2018) 11–20. of images, labels and captions, in: Advances in Neural Information
[20] G. Wu, J. Sun, Optimal switching integrity attacks in cyber-physical Processing Systems, 2016, pp. 2352–2360.
systems, in: 2017 32nd Youth Academic Annual Conference of Chinese [47] Abdel-Rahman Mohamed, George E. Dahl, Geoffrey Hinton, Acoustic
Association of Automation, YAC, IEEE, 2017, pp. 709–714. modelling using deep belief networks, IEEE Trans. Audio Speech Lang.
[21] Diro, Chilamkurti, Distributed attack detection scheme using deep learn- Process. 20 (1) (2014) 14–22.
ing approach for Internet of Things, Future Gener. Comput. Syst. 82 (2018) [48] Qin Zhang, Ou Yin, Zhang, A feature-hybrid malware variants detection
761–768. using CNN based opcode embedding and BPNN based API embedding,
[22] Volodymyr Mnih, AdriaPuigdomenechBadia, Mehdi Mirza, Alex Graves, Comput. Secur. 84 (2019b) 376–392.
Timothy Lillicrap, Tim Harley, David Silver, KorayKavukcuoglu, Asyn- [49] Bontupalli Alom, Taha, Intrusion detection using deep belief networks,
chronous methods for deep reinforcement learning, in: International in: 2015 National Aerospace and Electronics Conference, NAECON, 2015,
Conference on Machine Learning, 2016, pp. 1928–1937. pp. 339-344.
[23] Alec Radford, Luke Metz, SoumithChintala, Unsupervised representation [50] Ugo Fiore, Francesco Palmieri, Network anomaly detection with the
learning with deep convolutional generative adversarial network, 2015, restricted Boltzmann machine, Neurocomputing 122 (3) (2014) 13–23.
ArXiv preprint arXiv:1511.06434. [51] J. Yang, J. Deng, S. Li, Hao, Improved traffic detection with support vector
[24] Cao Xiong, Q. Yu, Reinforcement learning-based real-time power man- machine based on restricted Boltzmann machine, Soft Comput. 21 (11)
agement for hybrid energy storage system in the plug-in hybrid electric (2017a) 3101–3112.
vehicle, Appl. Energy 1 (211) (2018) 538–548. [52] Ying Zhang, Peisong Li, Xinheng Wang, Intrusion detection for IoT based
[25] H. Xu, Y. Ma, H. Liu, D. Deb, H. Liu, J. Tang, A. Jain, Adversarial attacks on improved genetic algorithm and deep belief network, IEEE Access 7
and defenses in images, graphs and text: A review, 2019, arXiv preprint (2019c) 31711–31722.
arXiv:1909.08072. [53] McDaniel Papernot, Swami, Harang, Crafting adversarial input se-
[26] Z. Katzir, Y. Elovici, Gradients cannot be tamed: Behind the impossible quences for recurrent neural networks, in: IEEE Military Communications
paradox of blocking targeted adversarial attacks, IEEE Trans. Neural Netw. Conference, 2016, pp. 49–54.
Learn. Syst. (2020). [54] Razvan Pascanu, Jack Stokes, MadyMarinescu HerminehSanossian, Anil
[27] S. Mahloujifar, D.I. Diochnos, M. Mahmoody, Learning under p-tampering Thomas, Malware classification with recurrent networks, in: IEEE Inter-
poisoning attacks, Ann. Math. Artif. Intell. (2019) 1–34. national Conference on Acoustics, Speech and Signal Processing, ICASSP,
[28] W. Jiang, H. Li, S. Liu, X. Luo, R. Lu, Poisoning and evasion attacks 2015, pp. 1916–1920.
against deep learning algorithms in autonomous vehicles, IEEE Trans. Veh. [55] Hammed HaddadPajouh, Ali Dehghantanha, RaoufKhayami, Kim-
Technol. 69 (4) (2020) 4439–4449. Kwang Raymond Choo, A deep Recurrent Neural Network based
[29] S. Sihag, A. Tajer, Secure estimation under causative attacks, IEEE Trans. approach for Internet of Things malware threat hunting, Future Gener.
Inform. Theory (2020). Comput. Syst. 85 (2018) 88–96.
[30] Ziv Katzir, Yuval Elovici, Quantifying the resilience of machine learning [56] Rosenberg Shabtai, Rokach, Elovici, Generic black-box end-to-end attack
classifiers used for cyber security, Expert Syst. Appl. 92 (2018) 419–429. against state of the art API call based malware classifiers, in: International
[31] Daming Li, Lianbing Deng, BrijBhooshan Gupta, Haoxiang Wang, Chang Symposium on Research in Attacks, Intrusions, and Defences, 2017, pp.
Choi, A novel CNN based security guaranteed image watermarking gen- 490–510.
eration scenario for smart city applications, Inform. Sci. 479 (2019a) [57] Jihyun Kim, Jaehyun Kim, Huong Le Thi Thu, Howon Kim, Long short
432–447. term memory recurrent neural network classifier for intrusion detection,
[32] Ma Li, Jiao, A hybrid malicious code detection method based on deep in: 2016 International Conference on Platform Technology and Service,
learning, Int. J. Secur. Appl. 9 (5) (2015) 205–216. 2016, pp. 1–5.
[33] Usama Ahmed, Imran Raza, Syed AsadHussain, Amjad Ali, Modelling cyber [58] Senior Sak, Rao, Beaufays, Fast and accurate recurrent neural network
security for software-defined networks those grow strong when exposed acoustic models for speech recognition, 2015, ArXiv preprint arXiv:1507.
to threats, J. Reliab. Intell. Environ. 1 (2–4) (2012) 123–146. 06947.

14
P. Dixit and S. Silakari Computer Science Review 39 (2021) 100317

[59] Wei Feng, Yuqin Wu, Yexian Fan, A new method for the prediction of [83] Bo Du, Wei Xiong, Jia Wu, Lefei Zhang, Stacked convolutional denoising
network security situations based on recurrent neural network with gated auto-encoders for feature representation, IEEE Trans. Cybern. 47 (4)
recurrent unit, Int. J. Intell. Comput. Cybern. 11 (4) (2018) 511–525. (2016) 1017–1027.
[60] Jianhua Yang, Kai Liu, Xiangui Kang, Edward K. Wong, Yun-Qing Shi, [84] Guifang Liu, HuaiqianBao, Baokun Han, A stacked auto encoder-based
Spatial image Steganography based on generative adversarial network, deep neural network for achieving gearbox fault diagnosis, Math. Probl.
2018, ArXiv preprint arXiv:1804.07939. Eng. (2018).
[61] S. Li, D. Ye, S. Jiang, C. Liu, X. Niu, X. Luo, Anti-steganalysis for image on [85] U.S.K.P.M. Thanthrige, J. Samarabandu, X. Wang, Machine learning tech-
convolutional neural networks, Multimedia Tools Appl. (2018b) 1–17. niques for intrusion detection on public dataset, in: 2016 IEEE Canadian
[62] Dengyu Xiao, Yixiang Huang, Xudong Zhang, Haotian Shi, Chengliang Conference on Electrical and Computer Engineering, CCECE, IEEE, 2016,
Liu, Yanming Li, Fault diagnosis of asynchronous motors based on LSTM pp. 1–4.
neural network, in: 2018 Prognostics and System Health Management [86] Florian Skopik, Giuseppe Settanni, Roman Fiedler, A problem shared is a
Conference, 2018, pp. 540–545. problem halved: A survey on the dimensions of collective cyber defence
[63] Niyaz Javaid, Sun, Alam, A deep learning approach for network intrusion through security information sharing, Comput. Secur. 60 (2016) 154–176.
detection system, in: Proceedings of the 9th EAI International Conference [87] Huaizhi Wang, JiaqiRuan, Zhengwei Ma, Bin Zhou, Xueqian Fu,
on Bio-inspired Information and Communications Technologies, 2016, pp. Guangzhong Ca, Deep learning aided interval state prediction for
21–26. improving cyber security in energy internet, Energy 174 (2019)
[64] AkankshaRai Sharma, PranavKaushik, Literature survey of statistical, deep 1292–1304.
and reinforcement learning in natural language processing, in: IEEE [88] Geethapriya Thamilarasu, ShivenChawla, Towards deep-learning-driven
International Conference on Computing, Communication and Automation, intrusion detection for the internet of things, Sensors 19 (1977) (2019).
ICCCA, 2017, pp. 350–354. [89] Khaled Alrawashdeh, Carla Purdy, Fast hardware assisted online learn-
[65] Jiang, Lu, Learning intentional communication for multi-agent coopera- ing using unsupervised deep learning structure for anomaly detec-
tion, in: Advances in Neural Information Processing Systems, 725, 2018, tion, in: 2018 International Conference on Information and Computer
pp. 4–7264. Technologies, 2018, pp.128-134.
[66] Li Xiao, He Zhu, Liu, Song, Generating adversarial examples with [90] H. Zhang, Y. Li, Z. Lv, A.K. Sangaiah, T. Huang, A real-time and ubiquitous
adversarial networks, 2018b, ArXiv preprint arXiv:1801.02610. network attack detection based on deep belief network and support
[67] Li, Yuxi, Deep reinforcement learning: An overview, 2017, ArXiv preprint vector machine, IEEE/CAA J. Autom. Sin. 7 (3) (2020b) 790–799.
arXiv:1701.07274. [91] Mahmoud Nabil, Muhammad Ismail, Mohamed Mahmoud,
[68] Dan Li, Dacheng Chen, Jonathan Goh, See-kiong Ng, Anomaly detection MostafaShahin, Khalid Qaraqe, ErchinSerpedin, Deep learning-based
with generative adversarial networks for multivariate time series, 2018a, detection of electricity theft cyber-attacks in smart grid AMI networks,
ArXiv preprint arXiv:1809.04758. in: Deep Learning Applications for Cyber Security, 2019, pp. 73–102.
[69] Moshe Kravchik, AsafShabtai, Detecting cyber attacks in industrial control [92] Zachary Lipton, John Berkowitz, Charles Elkan, A critical review of
systems using convolutional neural networks, in: Proceedings of the 2018 recurrent neural networks for sequence learning, 2015, ArXiv preprint
Workshop on Cyber-Physical Systems Security and Privacy, 2018, pp. arXiv:1506.00019.
72–83. [93] Hazarika Young, Poria, Cambria, Recent trends in deep learning based
[70] Mamoru Mimura, Hidema Tanaka, Heavy log reader: learning the context natural language processing, IEEE Comput. Intell. Mag. 13 (3) (2018)
of cyber attacks automatically with paragraph vector, in: International 55–75.
Conference on Information Systems Security, 2017, pp. 146–163. [94] ThanhThi Nguyen, Vijay JanapaReddi, Deep reinforcement learning for
[71] Alazab Vinayakumar, Poornachandran Soman, Venkatraman, Robust in- cyber security, 2019, ArXiv preprint ArXiv:1906.05799.
telligent malware detection using deep learning, IEEE Access 7 (2019a) [95] Sankar Salehinejad, Colak Barfett, Valaee, Recent advances in recurrent
46717–46738. neural networks, 2017, ArXiv preprint arXiv:1801.01078.
[72] D. Vasan, M. Alazab, S. Wassan, H. Naeem, B. Safaei, Q. Zheng, IMCFN: [96] Alazab Venkatraman, Vinayakumar, A hybrid deep learning image-based
Image-based malware classification using fine-tuned convolutional neural analysis for effective malware detection, J. Inf. Secur. Appl. 47 (2019)
network architecture, Comput. Netw. 171 (2020) 107138. (2019) 377–389.
[73] MamounAlazab Vinayakumar, PrabaharanPoornachandran Soman, Ameer [97] Elaine Raybourn, Michael Kunz, David Frit, Vince Urias, A zero-entry cyber
Al-Nemrat, SitalakshmiVenkatraman, IEEE Access 7 (2019b) 41525–41550, range environment for future learning ecosystem, in: Cyber-Physical
Vinayakumar. Systems Security, 2018, pp. 93–109.
[74] H.S. Chae, B.O. Jo, S.H. Choi, T.K. Park, Feature selection for intrusion [98] L.N. Darlow, E.J. Crowley, A. Antoniou, A.J. Storkey, CINIC-10 is not
detection using NSL-KDD, Recent Adv. Comput. Sci. 18 (2013) 4–187. imagenet or CIFAR-10, 2018, arXiv preprint arXiv:1810.03505.
[75] M. Hatada, M. Akiyama, T. Matsuki, T. Kasama, Empowering anti-malware [99] Roychowdhury Allen, Liu, Reward-based Monte Carlo-Bayesian reinforce-
research in Japan by sharing the MWS datasets, J. Inf. Process. 23 (5) ment learning for cyber preventive maintenance, Comput. Ind. Eng. 126
(2015) 579–588. (2018) 578–594.
[76] J. Song, H. Takakura, Y. Okabe, M. Eto, D. Inoue, K. Nakao, Statistical [100] Aidin Ferdowsi, Ursula Challita, WalidSaad, Narayan B. Mandalay, Robust
analysis of honeypot data and building of Kyoto 2006+ dataset for NIDS deep reinforcement learning for security and safety in autonomous
evaluation, in: Proceedings of the First Workshop on Building Analysis vehicle systems, in: IEEE International Conference on Intelligent
Datasets and Gathering Experience Returns for Security, 2011, pp. 29–36. Transportation Systems, ITSC, 2018, pp.307–312.
[77] N. Moustafa, J. Slay, UNSW-NB15: a comprehensive data set for network [101] Lantao Yu, Yi Wu, Rohit Singh, Lucas Joppa, Fei Fang, Deep reinforcement
intrusion detection systems (UNSW-NB15 network data set), in: 2015 learning for green security game with online information, in: Workshops
Military Communications and Information Systems Conference, MilCIS, at the Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
IEEE, 2015, pp. 1–6. [102] Zilong Lin, Yong Shi, ZhiXue, Idsgan: Generative adversarial networks
[78] R. Panigrahi, S. Borah, A detailed analysis of CICIDS2017 dataset for for attack generation against intrusion detection, 2018, ArXiv preprint
designing intrusion detection systems, Int. J. Eng. Technol. 7 (3) (2018) arXiv:1809.02077.
479–482, 24. [103] Rajendran Balakrishnan, Pelusi, Ponnusamy, Deep belief network en-
[79] J. Yang, Y.Q. Shi, E.K. Wong, X. Kang, JPEG steganalysis based on densenet, hanced intrusion detection system to prevent security breach in the
2017b, arXiv preprint arXiv:1711.09335. internet of things, Internet Things (2019) 100112.
[80] Andrade Meira, Carneiro Praca, Alonso-Betanzos Bolón-Canedo, Mar- [104] SujitRokka Chhetri, Anthony Bahadir Lopez, Jiang Wan, Mohammad Ab-
reiros, Performance evaluation of unsupervised techniques in cyber-attack dullah Al Faruque, GAN-Sec: Generative adversarial network modelling
anomaly detection, J Amb. Intell. Huma. Comput. (2019) 1–13. for the security analysis of cyber-physical production systems. IEEE:
[81] Thing, Network anomaly detection and attack classification: A deep Automation and Test in Europe Conference and Exhibition, DATE, 2019,
learning approah, in: IEEE Wireless Communications and Networking pp. 770–775.
Conference, 2017, pp. 1–6. [105] Chuanlong Yin, Yuefei Zhu, Shengli Liu, JinlongFei, He tong Zhang, An
[82] Carro Lopez-Martin, Sanchez-Esguevillas, Lloret, Sensors 17 (9) (2017) enhancing framework for bonnet detection using generative adversarial
1967. networks, in: 2018 International Conference on Artificial Intelligence and
Big Data, 2018, pp 228–234.

15

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy