An Entropy and Machine Learning Based Approach For Ddos Attacks Detection in Software Defined Networks
SDN is a cutting-edge approach to network management architecture that enhances adaptability, programma-
bility, and responsiveness to dynamic service and application needs. It centralizes the control plane, managing
data traffic routing and communication between network parts. Traditional network topologies integrate control
planes into routers and switches, while SDN control is centralized in a software-based controller, with control
and data planes separated1.
A DDoS attack is an intentional attempt to disrupt traffic on the network. It degrades the availability
and quality of service (QoS) of essential services by flooding the targeted system’s bandwidth and resources2.
Attackers exploit SDN’s centralized control plane, a single point of failure, to create undetected traffic patterns,
making detection and mitigation of DDoS attacks challenging3.
Although SDNs are more flexible and manageable, their centralized control and programmable nature also
make them vulnerable to DDoS attacks. Because network traffic patterns are dynamic and complicated and
attackers constantly change their techniques, these attacks can be difficult to detect and mitigate in
SDN systems.
Figure 1 shows the main steps of DDoS attack mitigation: traffic routing, attack fingerprint detection, response,
and machine learning adaptation4. Lately, DDoS attacks have become more challenging to detect due to their
variety. For example, multi-vector attacks, which combine multiple attack protocols, are common.
Therefore, more robust defense techniques are required.
DDoS detection and mitigation covers a wide range of approaches. These include statistical approaches,
which aim to protect networks by collecting and analyzing flow-related statistics5. They can be applied
in several ways, such as entropy-based techniques, trust management, time-series analysis, and anomaly
detection. However, these methods have limited ability to distinguish normal from malicious traffic accurately
and to adjust to emerging threats, which may lead to false positives or false negatives6.
Machine learning (ML) algorithms have shown promising potential for detecting and mitigating DDoS attacks7.
These approaches can analyze network traffic in real time, identify malicious behavior, and adapt detection
models based on real-time network data. In addition, they can handle large volumes of network data and detect
1
Institute of Graduate Studies and Research, Alexandria, Egypt. 2These authors contributed equally: Amany I. Hassan,
Eman Abd El Reheem and Shawkat K. Guirguis. *email: igsr.amany_ibrahim@alexu.edu.eg
Vol.:(0123456789)
www.nature.com/scientificreports/
zero-day attack patterns. This makes them an effective defense against evolving and sophisticated DDoS attacks.
On the other hand, ML-based DDoS detection approaches face limitations due to imbalanced datasets and the
dynamic nature of attacks8. Imbalances can lead to biased behavior and reduced accuracy in recognizing novel
attack patterns, as discussed by Ullah et al.9. Continuous model retraining and adaptation to evolving threats are
necessary to maintain efficacy10. ML complements traditional detection techniques, since no single
detection method can provide 100% accuracy11.
Blockchain technology has also been proposed for DDoS detection and mitigation, with the aim of enhancing
network security, scalability, and performance. However, its use in DDoS detection and mitigation faces
challenges such as potential latency and scalability limitations12.
The motivation behind this research is to develop a novel model that improves DDoS attack detection and
mitigation in SDN networks by utilizing system entropy and ML clustering techniques. System entropy, a measure
of disorder or randomness within a system, presents valuable details regarding the normal behavior of network
traffic. Anomalies indicative of DDoS attacks can be identified by tracking variations in entropy levels.
However, depending only on statistical techniques, such as entropy-based detection, may not be accurate
or responsive enough, especially in large-scale and dynamic SDN systems6. In addition to complementing the
statistical method, ML clustering techniques like the K-means algorithm enable the analysis of complex patterns
and the identification of anomalous network activity groups.
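To make the entropy idea concrete, the following is a minimal sketch of Shannon entropy computed over the requests seen in one window. It assumes that entropy is taken over the distribution of requests per source address and that it is normalized to [0, 1]; the source does not state either choice, and the function name `shannon_entropy` and the sample addresses are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(items):
    """Normalized Shannon entropy in [0, 1] of a sequence of hashable items.

    Assumption: normalization by log2(number of distinct items), so that a
    perfectly even distribution scores 1.0 and a single source scores 0.0.
    """
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0
    h = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return h / math.log2(len(counts))


# Hypothetical traffic: four sources sending evenly, then one flooding source.
normal_window = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"] * 5
attack_window = normal_window + ["10.0.0.9"] * 200

print(shannon_entropy(normal_window))  # evenly spread requests
print(shannon_entropy(attack_window))  # one dominant source lowers entropy
```

In this toy case the flooding source concentrates the request distribution, so the entropy of the attack window is lower than that of the normal window, which is the kind of variation the text proposes to track.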
The combination of system entropy with ML clustering techniques aims to create a comprehensive defense
and address several key challenges in DDoS attack detection and mitigation:
a. Real-time Detection: The model can identify DDoS attacks in real-time, allowing for quick response and
mitigation measures, by continuously monitoring system entropy and utilizing machine learning clustering
methods.
b. Adaptability: DDoS attack techniques are always changing; therefore, detection systems must adjust and
detect new attack patterns. Machine learning clustering techniques increase the model’s resilience against
new threats because they allow it to dynamically adjust its detection in response to observed network
behavior.
c. Scalability: Several network devices and a variety of traffic patterns are common in SDN systems. The sug-
gested approach is made to be scalable to manage the complexity and volume of network traffic, providing
accurate detection and extensive coverage throughout the network.
The main contributions of this work are summarized as follows:
1. Surveying the state-of-the-art literature on detecting intrusions in SDN environments and analyzing the
research gap.
2. Proposing a novel approach that combines a statistical entropy-based technique with an ML one for the
effective detection and mitigation of DDoS attacks. This leads to a comprehensive SDN defense mechanism.
This mechanism can mitigate attacks in real time, adapt to evolving attack strategies, and minimize false
positives.
3. Conducting an empirical evaluation using three modern SDN datasets, i.e., CIC-IDS2017,
CSE-CIC-2018, and CIC-DDoS2019.
4. Benchmarking with other anomaly detection approaches using three recent public datasets.
5. Reducing false positives and minimizing the impact of false alarms on network operations by combining
system entropy with machine learning clustering.
6. Enhancing the efficiency of DDoS detection while improving network resilience and security posture in the
face of evolving cyberattacks.
The remainder of this paper is organized as follows. The "Related work" section discusses the related work.
The "Methods" section presents the proposed methodology for DDoS detection and mitigation. The "Evaluation
scheme" section provides an in-depth analysis of the evaluation environment used in this study, outlining the
datasets, data preprocessing methodologies, and performance metrics for transparent analysis. The "Results"
section describes the experimental setup, evaluates the proposed methodology, and compares the results with
state-of-the-art approaches. Finally, the "Conclusion and future work" section provides the conclusion and
probable directions for future research.
Related work
This section examines different defense mechanisms for detecting DDoS attacks. It emphasizes the significance
of statistical, machine learning, deep learning, and blockchain approaches.
The survey of the literature on DDoS attack detection and mitigation techniques, as listed in the introduction
section, follows the literature review methodology outlined in Saied et al.13.
computing strategy not only utilized advanced machine learning techniques but also emphasized the significance
of entropy analysis in DDoS detection and mitigation. By incorporating entropy considerations into packet flow
analysis, their technique provided a promising avenue for enhancing the effectiveness and adaptability of DDoS
detection and defense mechanisms within network environments.
In contrast, the proposed model presents a more automated, scalable, and adaptable approach to DDoS attack
detection, potentially overcoming limitations associated with SVM classifier-based strategies.
Hannache et al.22 developed a cutting-edge Neural Network-based Traffic Flow Classifier (TFC-NN) aimed
at real-time detection of DDoS attacks. Their model was trained and implemented using a dataset comprising
both regular and malicious traffic, and it was deployed within a real SDN architecture. Impressively, the TFC-
NN achieved a remarkable global accuracy rate of 96.13%, showcasing its effectiveness in accurately discerning
between normal and malicious network activity.
In another study, Cui et al.23 utilized clustering technologies such as K-means to identify malicious traffic
within network streams. They further evaluated the efficacy of their approach by assessing communication
latency, detection accuracy, and defense effectiveness. By leveraging packet-in message registers to filter out mali-
cious traffic, they demonstrated the practical applicability of their scheme in real-world network environments.
Furthermore, Gu et al.24 introduced a sophisticated hybrid feature selection-based, semi-supervised K-means
detection technique. This approach not only addressed the challenges posed by outliers and local optimality but
also incorporated an enhanced density-based initial cluster center selection procedure. By integrating feature
selection strategies and semi-supervised learning techniques, they aimed to enhance the robustness and accu-
racy of DDoS attack detection within SDN networks, highlighting the importance of advanced methodologies
in mitigating cyber threats effectively.
only three fully connected layers for training completion. Experimental evaluation on the CIC-DDoS2019 dataset
demonstrated impressive accuracy, reaching 95%. This highlights the effectiveness of DNN-based approaches
in accurately classifying network traffic patterns, promising improved network security and anomaly detection
capabilities. Moreover, the model’s simplicity and high accuracy suggest its potential for practical deployment
in real-world network environments, offering a valuable tool for enhancing network monitoring and defense
systems.
A fully connected (FC) architecture was used in an explainable DL framework for the Industrial Internet of
Things (IIoT) developed by Khan et al.32. An autoencoder-based detection framework that used convolutional
and recurrent networks was proposed to identify cyber threats in IIoT networks. To enhance the learning of data
features, the framework used a two-step sliding window (SW) method to extract temporal and spatial charac-
teristics for attack event categorization and explanation. When compared to modern techniques, the empirical
results showed how well the framework extracted contextual elements of dangerous patterns and how robust it
was in identifying malicious events. The model performed better than state-of-the-art techniques, incorporated
explanation mechanisms for model decisions, and presented an inventive approach for addressing cyber threats
in IIoT networks. Additionally, Khan et al.33 introduced a deep-autoencoder-based intrusion detection system
(IDS) for IIoT networks. The model, based on an LSTM auto-encoder design, accurately identified invasive
events in real time. The model outperformed existing methods, achieving accuracy rates of 97.95% and 97.62%
on the gas pipeline and UNSW-NB15 benchmark datasets.
Khan et al.34 also proposed a privacy-conserving intrusion detection framework named PC-IDS, tailored for
securing Contemporary Smart Power Systems (SPNs) against cyber-attacks. The model used a hybrid machine
learning approach, transforming raw data into a privacy-preserving format, and identifying malicious events
using a probabilistic neural network. The framework outperformed existing methods in terms of false positive
rate, detection rate, and computational processing time, enhancing SPN security and privacy. The framework
achieved detection rates of 96.03% and 95.91%, demonstrating its effectiveness in enhancing SPN security. Addi-
tionally, Khan et al.35 highlighted the importance of protecting industrial control systems (ICSs) against cyber-
attacks due to their integration with IoT technologies. The paper proposed a novel intrusion detection system
(IDS) model called federated-simple recurrent units (SRUs) for IoT-based ICSs. The model used simple recurrent
units’ architecture to mitigate computational costs and address gradient vanishing issues. Experimental validation
showed that the model accurately detected intrusions in real-time without compromising privacy or security.
In comparison, the proposed model offers scalability, efficiency, and real-time detection. It can also adapt to
changing network conditions and quickly identify anomalies, reducing processing time and resource usage. The
entropy-based model with k-means clustering balances accuracy and efficiency, allowing for timely detection
without significant processing delays.
a. Integration of System Entropy: While many articles only discuss statistical or ML techniques for DDoS
detection, the suggested model incorporates system entropy as a fundamental element. A measure of the
network’s disorder, system entropy sheds light on how network traffic typically behaves. The model improves
its capacity to recognize anomalies indicative of DDoS attacks by adding entropy analysis to the detection
procedure, resulting in more precise and rapid detection.
b. Real-Time Detection and Mitigation: Enabling real-time DDoS attack detection and mitigation is one of
the main objectives of the proposed model. The model can detect anomalies and start mitigation measures
promptly by continuously monitoring system entropy and applying ML clustering algorithms. This real-time
responsiveness is essential for minimizing the impact of DDoS attacks on network availability and perfor-
mance.
c. Reduced False Positives: In DDoS detection, false positives are a common challenge that can result in needless
alert fatigue and resource usage. By integrating machine learning clustering techniques with system entropy,
the suggested model aims to address this issue. By analyzing several aspects of network traffic and detecting
abnormal behavior clusters, the model can lower false positives and enhance DDoS detection precision.
d. Experimental Validation: The suggested model is supported by experimental assessment on modern SDN
environment-specific datasets, including CIC-IDS2017, CSE-CIC-2018, and CIC-DDoS2019. These experiments show
the model’s practical application and performance by offering empirical evidence of its ability to identify
and mitigate DDoS attacks in real-world scenarios.
Methods
This section introduces the proposed scheme. It describes the overall design of the proposed comparative scheme,
including the entropy-based and machine learning algorithms employed and the system’s main parameters.
The proposed scheme consists of entropy-based and machine-learning clustering modules, strategically inte-
grated to enhance the effectiveness of DDoS attack detection and mitigation. Figure 2 illustrates the compre-
hensive architecture of this model. The primary step involves data preprocessing which is performed through
the following sequence:
– These network flow attributes enable packets, whether normal or anomalous, to be specified.
– The entropy values are then compared to a preset threshold to locate the presence of anomalies in these
features.
Subsequently, the request flows are divided into multiple time intervals of equal duration, denoted as T. After
each time interval, the entropy detection module comes into play, assessing whether an attack occurred in the
last interval. In the case of a detected attack, the machine learning technique starts the clustering of active users
within that interval into three distinct clusters: normal users, suspicious users, and attackers. This sequential
process emphasizes the model’s dynamic responsiveness to potential attacks, integrating both entropy-based
detection and machine learning clustering for a comprehensive defense mechanism.
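The interval step described above can be sketched as follows. This is an illustration only: it assumes each request is available as a (timestamp in seconds, user identifier) pair, and the function name `split_into_intervals` and the sample flows are hypothetical, not taken from the source.

```python
from collections import defaultdict


def split_into_intervals(flows, interval_size):
    """Group (timestamp_seconds, user) pairs into consecutive windows.

    Each window covers `interval_size` seconds (the duration T in the text).
    Returns the per-window user lists in chronological order; windows with no
    requests are skipped.
    """
    if interval_size <= 0:
        raise ValueError("interval_size must be positive")
    buckets = defaultdict(list)
    for timestamp, user in flows:
        buckets[int(timestamp // interval_size)].append(user)
    return [buckets[index] for index in sorted(buckets)]


# Hypothetical request flows with T = 120 s.
flows = [(0, "a"), (119, "b"), (120, "a"), (250, "c")]
windows = split_into_intervals(flows, 120)
print(windows)
```

Each resulting window would then be handed to the entropy check, and only windows flagged as under attack would go on to the clustering step.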
The module sets a lower threshold βlower for the system’s overall entropy H(X); an entropy below this threshold
indicates that the system is under attack. Figure 3 shows the system entropy during the DDoS attack and the
entropy of the normal users only in the same intervals on a part of the CIC-IDS2017 dataset with an interval size
of 120 s. The figure shows a noticeable decrease in entropy after the DDoS attack started, yet sometimes both
entropy values were close to each other, possibly due to slow attack rates. This comparison between the system
entropy and the normal users’ entropy provides valuable insight into how the DDoS attack changes, particularly
during intervals marked by subtle variations in attack intensity.
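A minimal sketch of the threshold check follows. It assumes a normalized entropy over the per-user request distribution in each interval, and the names `normalized_entropy` and `flag_intervals`, as well as the sample data and the value 0.5 for βlower, are hypothetical choices for illustration.

```python
import math
from collections import Counter


def normalized_entropy(users):
    """Normalized Shannon entropy in [0, 1] of the requests-per-user distribution."""
    counts = Counter(users)
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0
    h = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return h / math.log2(len(counts))


def flag_intervals(intervals, beta_lower):
    """Return True for each interval whose entropy falls below beta_lower."""
    return [normalized_entropy(users) < beta_lower for users in intervals]


# Hypothetical intervals: evenly spread users, then one user dominating.
intervals = [["a", "b", "c", "d"], ["a"] * 50 + ["b", "c"]]
flags = flag_intervals(intervals, beta_lower=0.5)
print(flags)
```

Only the second interval is flagged here, because a single user issues almost all of its requests and the entropy drops well below the chosen threshold.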
The first procedure, attack detection, has O(n) time complexity and O(n) space complexity, where n is
the number of users in the interval. The detection procedure consists of a clustering procedure and an attacker
detection procedure. The clustering has a time complexity of O(k × n × I), where k is the number of clusters
(3 in our case) and I is the number of iterations, which is fixed in our case, so overall the clustering has O(n)
time complexity and O(n) space complexity. The attacker detection procedure has O(n) time complexity (using
Algorithm 3.2). Therefore, the overall system has O(n) time complexity and O(n) space complexity, where n is
the number of active users in the interval.
Algorithm 3.1 encapsulates the comprehensive detection procedure, offering a brief overview of the steps
involved in identifying potential attackers within the system.
Figure 3. Entropy for the system during DDoS and entropy for the normal users in the same intervals on the
CIC-IDS2017 dataset.
Parameter | Description
Interval size | The duration after which the system considers one interval done
Entropy lower bound threshold (βlower) | The system entropy below which the entropy module considers the system to be under attack
Suspicious cluster entropy change delta (δsusp) | The entropy ratio below which the system considers the suspected user as an attacker
Attacker cluster entropy change delta (δattack) | The entropy ratio below which the system considers the possible attacker as an attacker when it is tested against the normal users’ k-means clustering group
Dataset | Size | Duration | Format | Publicly available | Contains attack traffic | Attack types
CICIDS201741 | 50 GB | 5 days | Packet | yes | yes | DoS, DDoS
CSECIC201842 | 500 GB | 2 days | Packet | yes | yes | Brute force, DoS, Botnet, DDoS attacks
CICDDOS201943 | 150 GB | 2 days | Packet | yes | yes | Modern reflective DDoS attacks such as DNS, NetBIOS, LDAP, MSSQL, UDP, UDP-Lag, SYN, NTP, DNS, SNMP, and WebDDoS
into three distinct groups: normal users, suspicious users, and potential attackers, as shown in Fig. 4. The uti-
lization of the K-means clustering method is attributed to its versatility and efficiency in handling substantial
volumes of data promptly. The algorithm’s effectiveness lies in its ability to efficiently group network traffic
data points into clusters, making it suitable for the rapid and real-time identification of user behavior patterns.
However, a key challenge in employing K-means is the selection of an appropriate value for "k", representing the
number of clusters. Selecting an unsuitable "k" value can lead to suboptimal clustering outcomes.
To address this challenge, we have opted for a value of "k" equal to 3, aligning with the need to classify users
into three primary clusters. This decision ensures that the K-means algorithm effectively captures the nuances
of user behavior and facilitates the differentiation of normal users, suspicious users, and potential attackers
within the system.
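The clustering step with k = 3 can be illustrated with a small, dependency-free version of Lloyd's K-means. This sketch makes several assumptions that the source does not confirm: each user is described by a single feature (requests per interval), the initial centroids are spread deterministically over the sorted values, and the clusters are named by increasing request rate. The function `kmeans_1d` and the sample rates are hypothetical.

```python
def kmeans_1d(values, k=3, iterations=20):
    """Lloyd's K-means on scalar values.

    Returns (centroids, labels). Centroids start at evenly spaced positions in
    the sorted data (an illustrative, deterministic choice), so label 0 is the
    lowest-rate cluster and label k-1 the highest-rate cluster.
    """
    if k < 2 or len(values) < k:
        raise ValueError("need k >= 2 and at least k values")
    ordered = sorted(values)
    centroids = [float(ordered[(len(ordered) - 1) * i // (k - 1)]) for i in range(k)]

    def nearest(value):
        # Index of the closest centroid; ties go to the lower-rate cluster.
        return min(range(k), key=lambda c: abs(value - centroids[c]))

    for _ in range(iterations):
        labels = [nearest(v) for v in values]
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = sum(members) / len(members)
    labels = [nearest(v) for v in values]
    return centroids, labels


CLUSTER_NAMES = ("normal", "suspicious", "attacker")  # assumed naming by rate

# Hypothetical requests per user in one flagged interval.
rates = [3, 4, 5, 4, 40, 45, 400, 420]
centroids, labels = kmeans_1d(rates, k=3)
print(centroids)
print([CLUSTER_NAMES[label] for label in labels])
```

With a fixed k and a fixed number of iterations, each pass touches every user once per cluster, which is consistent with the linear cost in the number of users discussed in the complexity analysis.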
After clustering the users into the three clusters, the evaluation process proceeds to test each user (ui) within
the suspicious and potential attackers’ clusters against the normal users’ cluster. This examination aims to assess
the impact of each user on the entropy of the normal users’ cluster, as expressed in the following equation. Within
the examined clusters, distinct thresholds are established for the entropy ratio change, denoted as Suspicious
Cluster Entropy Change Delta (δsusp) and Attacker Cluster Entropy Change Delta (δattack). The evaluation
criterion involves comparing ratioi(h) with the corresponding δi for each cluster. If the ratio is less than
the predetermined threshold (ratioi(h) < δi), the user is identified as a potential attacker. This
assessment process supports a robust and accurate identification of attackers among the suspicious and
potential attacker users within the system.
\[
\mathrm{ratio}_i(h) = \frac{H(X \cup u_i)}{H(X)}
\]
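The ratio test can be sketched as below. The sketch assumes that X is represented by the request counts of the users in the normal cluster, that H is the (unnormalized) Shannon entropy of the resulting request distribution, and that a candidate is flagged when the ratio falls below the cluster's delta. The function names, the sample counts, and the delta value 0.5 are hypothetical.

```python
import math


def entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def entropy_ratio(normal_counts, candidate_count):
    """ratio = H(X ∪ u) / H(X), with X the normal cluster and u the candidate."""
    base = entropy(normal_counts)
    if base == 0:
        raise ValueError("normal cluster entropy is zero; ratio is undefined")
    return entropy(list(normal_counts) + [candidate_count]) / base


def is_attacker(normal_counts, candidate_count, delta):
    """Flag the candidate when adding it pulls the entropy ratio below delta."""
    return entropy_ratio(normal_counts, candidate_count) < delta


# Hypothetical normal cluster: four users with five requests each.
normal_counts = [5, 5, 5, 5]
print(entropy_ratio(normal_counts, 200))  # high-rate candidate lowers the ratio
print(entropy_ratio(normal_counts, 5))    # normal-rate candidate raises it
print(is_attacker(normal_counts, 200, delta=0.5))
```

A high-rate candidate dominates the combined distribution and drives the ratio well below 1, whereas a candidate that behaves like the normal users leaves the ratio at or above 1.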
Training parameters
The system includes a variety of parameters (data attributes) that manage its interaction with incoming request
flows. These parameters collaboratively influence the system’s performance, serving a crucial role in guaranteeing
optimal functionality and responsiveness. A summary of these parameters is described in Table 1:
In this study, we focused on four main parameters, namely interval size, βlower, δsusp, and δattack. In the
self-assessment on CIC-IDS201741, the four parameters presented similar behavior concerning system accuracy.
Initially, accuracy increases as the respective parameter increases until it reaches its peak, and then it
decreases as the parameter value continues to increase. This pattern resembles the bell-shaped curve of a
normal distribution.
Evaluation scheme
This section provides a comprehensive overview of the evaluation environment employed in this study. It includes
an extensive exploration of the datasets utilized. To ensure a comprehensive and open discussion about the
results, it also specifies the exact evaluation criteria that will be used to evaluate the effectiveness of the sug-
gested approaches.
Datasets description
CIC-IDS2017 dataset has been curated by the Canadian Institute for Cybersecurity (UNB) to accurately describe
network activities and simulate attack scenarios based on real security reports. The dataset offers a comprehensive
range of network behaviors for analysis. It includes various protocols like Hypertext Transfer Protocol (HTTP),
Secure Shell Protocol (SSH), File Transfer Protocol (FTP), Hop-by-Hop IPv6 (HOPOPT), Transmission Control
Protocol (TCP), and User Datagram Protocol (UDP). The refined data preserves key characteristics such as Source
IP, Destination IP, Source Port, Protocol, Timestamp, and Label, and is simplified for efficient model evaluation41.
CSECIC2018 dataset is a crucial tool for cybersecurity researchers, providing network traffic data for intrusion
and DDoS detection models and algorithms. Like CIC-IDS2017, this dataset covers various protocols, offering
a comprehensive view of network behaviors, including HTTP DDoS attacks42.
CICDDoS2019 covers DDoS attacks and includes both benign traffic and common attacks. The dataset enables the
analysis of network traffic and the extraction of over 80 traffic features. It features labeled flows categorized by
timestamp, source and destination IPs, source and destination ports, protocols, and attack types43.
Table 2 provides a summary of the three datasets. The model assessment was carried out on CIC-IDS201741
dataset.
Evaluation metrics
To conduct a comprehensive performance evaluation, it is essential to consider a variety of metrics. The confusion
matrix is used to evaluate the performance of the model. The performance assessment is based on four main
measures: true positive (TP), true negative (TN), false positive (FP), and false negative (FN), where
positive indicates being an attacker. The system accuracy is calculated as the fraction of correct labels, as
shown in the subsequent equation.
\[
\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
\]
In our model, accuracy (acc) and the other confusion-matrix metrics are calculated for the mitigation effect. In
other words, if the entropy module detects that a user is an attacker, the TP value is affected only from the next
interval onward, since all of that user's requests in the current interval were already handled (i.e., counted as FN).
In addition to accuracy, the false positive rate (FPR), Precision, Recall, and F1-Score are calculated for the
technique as follows. Recall (sensitivity) measures the percentage of attackers that are correctly detected.
Additionally, the F1-score symmetrically represents both recall and precision.
\[
\mathrm{FPR} = \frac{FP}{FP + TN}
\]
\[
\mathrm{Precision}\ (prec) = \frac{TP}{TP + FP}
\]
\[
\mathrm{Recall}\ (recall) = \frac{TP}{TP + FN}
\]
\[
\mathrm{F1Score} = \frac{2 \times prec \times recall}{prec + recall}
\]
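As a quick check of the formulas above, the following sketch computes accuracy, FPR, precision, recall, and F1-score from confusion-matrix counts. The function name `classification_metrics`, the convention of returning 0.0 when a denominator is zero, and the sample counts are assumptions made for illustration; the counts are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the confusion-matrix metrics defined in the text.

    Assumption: a metric whose denominator is zero is reported as 0.0.
    """
    def safe_div(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "fpr": safe_div(fp, fp + tn),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


# Hypothetical counts, with "positive" meaning "attacker".
metrics = classification_metrics(tp=90, tn=880, fp=20, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```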
One of the main challenges when utilizing datasets for training and testing machine learning models, includ-
ing those considered for evaluating the proposed approach, is data imbalance. Data imbalance can cause machine
learning algorithms to prioritize accuracy for the majority class while neglecting the minority class, resulting in
biased models. To make sure that the model can effectively identify patterns from all classes, dataset imbalance
must be addressed. Dataset dimension reduction can help address data imbalance by mitigating the impact of the
majority class dominating the feature space44. Performance measures such as Matthews Correlation Coefficient
(MCC), and Geometric Mean (G-means) are frequently used to evaluate how well machine learning models
perform, especially when dealing with imbalanced datasets. MCC considers true and false positives and negatives
and ranges from − 1 to 1. A coefficient of 1 represents a perfect prediction, 0 represents a random prediction,
and − 1 indicates total disagreement between prediction and observation.
[Figure panels: (b) Attacker Delta and FPR%; (c) Attacker Delta and F1-Score. Phase annotations: (1) FP = TP = 0; (2) FPR and F1-Score increase due to an increase in FP% (from 0% to 0.0000988%) and TP% (from 0% to 27.7676056%); (3) no remarkable effect on FPR or F1-Score.]
On the other hand, G-means evaluates the classifier’s performance by considering both sensitivity and
specificity. Sensitivity measures the classifier’s accuracy in detecting positive events (attackers), while
specificity evaluates its ability to identify negative events (normal users), providing a balanced assessment
of classification accuracy45. These metrics provide a more comprehensive assessment of the model’s performance,
enabling more accurate decision-making in real-world applications.
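\[
\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}
\]
\[
\text{G-means} = \sqrt{\mathrm{sensitivity} \times \mathrm{specificity}} = \sqrt{\frac{TP}{TP + FN} \times \frac{TN}{TN + FP}}
\]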
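The two imbalance-aware metrics can be computed as in the sketch below. It assumes the usual geometric-mean definition of G-means, that is, the square root of sensitivity times specificity, and it returns 0.0 when a denominator is zero; the function names and sample counts are hypothetical.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient, in [-1, 1]; 0.0 if undefined."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


def g_mean(tp, tn, fp, fn):
    """Geometric mean of sensitivity (TPR) and specificity (TNR)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Hypothetical counts, with "positive" meaning "attacker".
print(mcc(tp=50, tn=50, fp=0, fn=0))     # perfect prediction
print(mcc(tp=0, tn=0, fp=50, fn=50))     # total disagreement
print(g_mean(tp=90, tn=880, fp=20, fn=10))
```

Unlike accuracy, both values stay low when a classifier simply predicts the majority class, which is why they are useful on imbalanced traffic data.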
Results
This study conducts a series of experiments, including both self-assessment and comparative analyses. First,
a single-protocol dataset generator was implemented as an initial phase to test the sensitivity
of the system parameters. After that, the system parameters were tested comprehensively against the
CIC-IDS201741 dataset. Finally, the system’s performance was thoroughly assessed against multiple datasets,
including CIC-IDS201741, CSECIC201842, and CIC-DDoS201943. The system parameters’ sensitivity was used to plan
the training approach used later. The rest of this section presents the results of both the self-assessment and
comparative analyses, detailing the various phases of our methodology and the outcomes observed at each step.
Model assessment
The proposed model contains multiple parameters that control its results, as shown in "Methods" section. In this
work, we focused on the following parameters.
• Interval Size
• Entropy Lower Threshold (βlower )
• Possible Attackers entropy Delta (δattack )
• Suspicious Users entropy delta (δsusp )
In the beginning, the model is trained at a selected interval size (mostly 120-s intervals). After that, each
parameter is varied over a value range that depends on its type; for the entropy lower threshold and the
entropy deltas, the range is [0, 1].
The experiments reveal distinct phases in the behavior of the key parameters βlower , δattack , and δsusp, each
affecting the system’s accuracy and false positive rate (FPR) differently. For instance, varying βlower shows three
distinct phases, with the system transitioning from low accuracy to high accuracy as it is adapted to different
entropy ranges. Similarly, δattack shows three phases, with its impact on accuracy being less significant compared
to βlower , but still contributing to overall system performance.
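The kind of sweep described above can be illustrated on toy data. The sketch below varies βlower over [0, 1] and scores interval-level detection only; the interval entropies and labels are invented for illustration and are not measurements from this study, and the names `interval_accuracy` and `sweep_beta` are hypothetical.

```python
def interval_accuracy(intervals, beta_lower):
    """Fraction of intervals labeled correctly when flagging entropy < beta_lower."""
    correct = sum((entropy < beta_lower) == is_attack for entropy, is_attack in intervals)
    return correct / len(intervals)


def sweep_beta(intervals, steps=10):
    """Evaluate accuracy for beta_lower = 0, 1/steps, ..., 1."""
    return [(i / steps, interval_accuracy(intervals, i / steps)) for i in range(steps + 1)]


# Toy (entropy, is_attack) pairs: attack intervals have lower entropy.
toy_intervals = [
    (0.25, True), (0.30, True), (0.35, True),
    (0.60, False), (0.70, False), (0.80, False), (0.90, False),
]
results = sweep_beta(toy_intervals)
best_beta, best_accuracy = max(results, key=lambda pair: pair[1])
for beta, accuracy in results:
    print(f"beta_lower={beta:.1f} accuracy={accuracy:.3f}")
```

On this toy data the curve shows the same three phases: no interval is flagged at low thresholds, accuracy rises once the threshold covers the attack intervals, and accuracy falls again when normal intervals start to be flagged.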
CIC‑IDS2017 self‑assessment
A detailed self-assessment of our model is conducted using the CIC-IDS201741 dataset. Experiments measure the
impact of each attribute (βlower, δattack, and δsusp) on metrics such as accuracy, FPR, and F1-Score. Figure 5a
shows the results of changing the value of βlower. At low values, the accuracy is at its lowest with FPR = 0, since
the entropy detection alarm is never fired and thus no additional checks are carried out. The accuracy then starts
to increase significantly after βlower = 0.2. This can be justified by Fig. 3, where the entropy of all attacking
intervals satisfies H(U) > 0.25, so βlower < 0.25 won’t cause any checks. After βlower = 0.2, as βlower increases,
the system starts inspecting more intervals and detecting more attackers, which increases the TP rate and accuracy
until it reaches a certain limit. With a continued increase of βlower, the system starts considering normal
intervals as under-attack intervals; thus, the FP rate increases and accuracy decreases because high-rate normal
users are considered attackers.
Regarding FPR, as shown in Fig. 5b, the curve can be divided into three phases: very low FPR with βlower < 0.2,
where FP = TP = 0; a rapid increase in accuracy accompanied by a rapid increase in FP and FPR; and a marked
increase in FP that reduces accuracy and results in a higher FPR. Figure 5c shows that the F1-Score follows the
same pattern as accuracy.
Overall, the effect of βlower has three phases. The first phase, where βlower < H(U) and H(U) is the minimum
attacking interval’s entropy, is characterized by low FP and TP rates, and thus a lower accuracy and F1-Score,
since the module doesn’t consider any interval suspicious. In the second phase, βlower covers the range of
suspicious intervals, causing an increase in accuracy due to the increase in TP. Finally, in the third phase,
βlower covers the normal intervals’ entropy ranges, and the accuracy slightly decreases because high-rate
users are considered attackers, i.e., FP increases.
The study demonstrates a similar three-phase effect for δattack, as shown in Fig. 6. Figure 6a shows that
at lower values, accuracy is very low since most of the attackers fall in the possible attackers’ cluster. As δattack
increases, more attackers are detected, increasing TP, FP, and accuracy. However, the impact of δattack on system
accuracy is less remarkable compared to the effect of βlower. In the third phase, increasing δattack doesn’t
significantly impact accuracy due to the higher likelihood of the system being attacked.
The analysis shows nuanced relationships between the attributes (βlower , δattack , and δsusp) and system perfor-
mance. For example, lower values of δattack lead to lower accuracy due to missed detections of attackers, while
higher values increase false positives. These observed patterns help in understanding the behavior of the model
under different conditions, guiding parameter tuning and training approaches for improved performance.
Overall, the effect of δattack can be divided into three phases. In the first phase, at low δattack values, the
procedure does not detect attackers, i.e., TP is very low, so accuracy is very low and FN is high. In the
second phase, accuracy reaches a peak due to high TP and low FN rates. Finally, at high δattack values, FP
starts increasing because high-rate normal users are considered attackers, causing a slight decrease in accuracy
and F1-Score.
Finally, Fig. 7 shows the effect of δsusp . There is no remarkable effect on the system. It seems that no true
attackers fall in the suspicious cluster, so the system has lower accuracy with lower δsusp as shown in Fig. 7a. Only
at higher values of δsusp, the system starts to falsely mark normal users as attackers, increasing FP and FPR and
decreasing the system’s accuracy and F1-Score.
The δsusp was expected to follow a pattern similar to that of δattack, yet it showed no significant effect on the
system except at very high values, where it decreased accuracy due to an increase in FP. On further investigation,
we found no attackers falling into this cluster in our datasets. Still, it is safer to check the suspicious cluster
to cover cases where a very high-rate attacker shifts the clustering of lower-rate attackers,
so that some attackers fall into the suspicious cluster.
Benchmarking
The proposed system is evaluated using three distinct datasets: CICIDS201741, CSECIC201842, and
CICDDOS201943. Performance metrics such as accuracy, FPR, TPR, F1-Score, G-means, and MCC are used for
comparison. The results of the evaluation are presented in Table 3. For each dataset, the system performance is
compared to multiple techniques. The results will be discussed in the remainder of this subsection.
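For reference, the listed metrics can be computed from confusion-matrix counts as in the sketch below. It uses the standard textbook definitions and assumes that G-means denotes the geometric mean of TPR and TNR; the function name and the example counts are illustrative only and are not results from the evaluation.

```python
import math

def _ratio(num, den):
    """Division that returns 0.0 when the denominator is zero."""
    return num / den if den else 0.0

def detection_metrics(tp, tn, fp, fn):
    """Accuracy, TPR, FPR, F1-Score, G-mean and MCC from confusion-matrix counts."""
    tpr = _ratio(tp, tp + fn)            # recall / detection rate
    tnr = _ratio(tn, tn + fp)            # specificity
    precision = _ratio(tp, tp + fp)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "tpr": tpr,
        "fpr": _ratio(fp, fp + tn),
        "f1": _ratio(2 * precision * tpr, precision + tpr),
        "g_mean": math.sqrt(tpr * tnr),  # assumed definition of G-means
        "mcc": _ratio(tp * tn - fp * fn, mcc_den),
    }

# Illustrative counts only.
print(detection_metrics(tp=90, tn=80, fp=20, fn=10))
```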
For the CICIDS201741 dataset, the performance of the system is compared against other notable
techniques, including Lucid29, Multi-layer Perceptron (MLP), 1D-CNN46, LSTM46, and the combined approach
of 1D-CNN + LSTM46. This comparison shows how the proposed system fares against established methods
on the CICIDS201741 dataset. The comparative results are shown in Table 4 and discussed below.
Roopak et al.46 proposed four different DL models for DDoS attack detection in Internet of Things (IoT) networks.
The models are built with combinations of LSTM, CNN, and fully connected layers. The input layer of all
the models consists of 82 units, one for each flow-level feature available in CICIDS201741, while the output layer
returns the probability of a given flow being part of a DDoS attack. The 1D-CNN + LSTM46 model produces good
classification scores, while the others seem to suffer from high FN rates. The same behavior is observed on the
CICIDS201741 dataset: the increase in FN and decrease in FP lead to higher accuracy than Lucid29 and, in turn,
than the other techniques used. Then, the
CSECIC201842 dataset results were compared to the Lucid29 technique as shown in Table 5.
The CSECIC201842 dataset showed different behavior from the other two datasets: FP increased and FN
decreased, but overall the system achieves a slight improvement in accuracy over the Lucid29 technique. The
dataset contains one flow with a high rate compared to the other flows, which caused the increase in FP rate
since this flow was falsely detected as an attack.
Finally, the CICDDOS201943 dataset results were compared to the Cil et al.31 and Alghazzawi et al.47 techniques,
as shown in Table 6. The proposed system shows higher accuracy rates compared to Cil et al.31 and Alghazzawi
et al.47. The system showed higher FP rates on this dataset than on CICIDS201741.
The proposed model consistently outperforms existing techniques across all datasets, demonstrating superior
accuracy and robustness in DDoS attack detection. Comparative analysis with techniques such as Lucid29, MLP,
and 1D-CNN46 highlights the strengths of our model, particularly in mitigating false positives and achieving
high accuracy rates.
This study contributes to the advancement of DDoS detection methods by providing a comprehensive analysis
of key system parameters and their impact on performance. The research outcomes establish the foundation
for future investigations that aim to enhance the efficiency and effectiveness of DDoS detection systems in real-
world scenarios.
1. The hybrid methodology proposed in this study effectively integrates statistical and machine-learning cluster-
ing techniques, offering a robust solution for detecting and mitigating DDoS attacks in SDN environments.
2. Integration of entropy-based alerting with clustering analysis empowers the hybrid method to effectively
address sudden and rapidly initiated DDoS attacks, enhancing the system’s resilience against such attacks.
3. Integration of an entropy-based alerting and detection mechanism enables real-time monitoring of incoming
requests, enabling rapid detection and mitigation of potential attacks and enhancing system responsiveness.
Once an attack is suspected, the active users are clustered based on their request rates and filtered based on
their influence on the system’s entropy to quickly identify potential attackers.
4. Utilization of the K-means clustering algorithm provides an efficient means of analyzing the influence of
active users on the system’s entropy, contributing to improved defense mechanisms against DDoS attacks.
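The clustering and filtering step summarized in items 3 and 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation evaluated in this study: it uses a plain one-dimensional K-means with three clusters, measures a user's influence as the entropy gained when that user's traffic is removed, and applies a hypothetical threshold `min_entropy_gain` that is not the δattack or δsusp parameter discussed earlier. All names and sample rates are invented.

```python
import math

def entropy(counts):
    """Shannon entropy (bits) of the request shares."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def kmeans_1d(values, k=3, iters=50):
    """Plain 1-D K-means; returns one label per value (0 = lowest centre)."""
    ordered = sorted(values)
    # Deterministic start: centres spread evenly over the sorted values.
    centres = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: abs(v - centres[j])) for v in values]
        new_centres = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centres.append(sum(members) / len(members) if members else centres[j])
        if new_centres == centres:
            break
        centres = new_centres
    return labels

def flag_attackers(rates, min_entropy_gain=0.05):
    """rates: dict mapping user -> request count in the suspected interval."""
    users = list(rates)
    labels = kmeans_1d([rates[u] for u in users])
    top = max(labels)                      # highest-rate cluster
    base = entropy(list(rates.values()))
    flagged = []
    for user, label in zip(users, labels):
        if label != top:
            continue
        without = entropy([r for u, r in rates.items() if u != user])
        # A user whose removal restores entropy is dominating the traffic.
        if without - base > min_entropy_gain:
            flagged.append(user)
    return flagged

# Hypothetical interval: six ordinary users and one flooding source "x".
rates = {"a": 10, "b": 12, "c": 9, "d": 11, "e": 10, "f": 30, "x": 500}
print(flag_attackers(rates))
```

In this toy interval only the flooding source lands in the highest-rate cluster and raises the entropy when removed, so it is the only user flagged; an interval of evenly spread users yields no flags because removing any one of them lowers the entropy.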
Future research directions focusing on enhancing detection accuracy, exploring alternative methodologies,
and addressing system optimization challenges can further strengthen the resilience of SDN networks against
DDoS attacks. The future scope and implications of this study could include:
a. Exploring the incorporation of more advanced machine learning algorithms to enhance the accuracy of
users’ classification based on their behavior (request types) within the system.
b. Evaluation of alternative clustering methods beyond K-means could provide valuable insights into optimizing
system performance and scalability, providing a more comprehensive perspective on clustering techniques
suitable for DDoS attack detection in SDN environments.
c. Implementation of trust-management mechanisms for mitigating DDoS attacks could provide additional
layers of security to the system. This method allows users to accumulate a trust score during their interaction
with the system, which can be used to verify their legitimacy. If any abusive behavior occurs, the trust score
will decrease, and if it falls below a certain threshold, the user will be blocked from further interactions with
the system. Applying this approach could enhance overall security measures.
d. Optimization efforts should be directed towards mitigating the time and space complexity introduced by the
entropy detection mechanism. Effective system performance depends on optimizing the entropy detection
procedure, which guarantees low computational overhead and resource consumption for smooth functioning
in real-world SDN
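
The trust-management idea in item c can be sketched as below. This is only an illustration of the described behavior (a score that accumulates with normal interactions, decreases on abusive behavior, and triggers blocking below a threshold); the class name, the default values, and the notion of an "abusive" flag supplied by the caller are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class TrustManager:
    initial: float = 5.0          # starting score for an unseen user (assumed)
    reward: float = 1.0           # gain per well-behaved interaction (assumed)
    penalty: float = 3.0          # loss per abusive interaction (assumed)
    max_score: float = 10.0       # cap on accumulated trust (assumed)
    block_threshold: float = 0.0  # users below this score are blocked (assumed)
    scores: dict = field(default_factory=dict)
    blocked: set = field(default_factory=set)

    def handle(self, user, abusive):
        """Update trust after one interaction; return True if the user may continue."""
        if user in self.blocked:
            return False
        score = self.scores.get(user, self.initial)
        if abusive:
            score -= self.penalty
        else:
            score = min(score + self.reward, self.max_score)
        self.scores[user] = score
        if score < self.block_threshold:
            self.blocked.add(user)
            return False
        return True

tm = TrustManager()
print(tm.handle("alice", abusive=False))  # well-behaved user keeps access
print(tm.handle("bob", abusive=True))     # first abuse lowers the score
print(tm.handle("bob", abusive=True))     # second abuse drops below the threshold
print(tm.handle("bob", abusive=False))    # blocked users stay blocked
```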
Data availability
The datasets used and analyzed during the current study are publicly available as follows: the CICIDS2017
dataset is available at https://www.unb.ca/cic/datasets/ids-2017.html, the CSECIC2018 dataset is available at
https://www.unb.ca/cic/datasets/ids-2018.html, and the CICDDOS2019 dataset is available at
https://www.unb.ca/cic/datasets/ddos-2019.html.
References
1. Shah, S., Bae, S., Jaikar, A. & Noh, S.-Y. An adaptive load monitoring solution for logically centralized SDN controller. 2016 18th
Asia-Pacific Network Operations and Management Symposium (APNOMS) 1–6. https://doi.org/10.1109/APNOMS.2016.7737207
(2016).
2. Chakraborty, S., Kumar, P. & Sinha, B. A study on DDOS attacks, danger and its prevention. Int. J. Res. Anal. Rev 6(2), 10–15.
https://doi.org/10.1729/Journal.20847 (2019).
3. Wang, J., Wang, L. & Wang, R. A method of DDoS attack detection and mitigation for the comprehensive coordinated protection
of SDN controllers. Entropy 25, 1210. https://doi.org/10.3390/e25081210 (2023).
4. Salunke, K. & Ragavendran, U. Shield techniques for application layer DDoS attack in MANET: A methodological review. Wirel.
Pers. Commun. 120, 2773–2799. https://doi.org/10.1007/s11277-021-08556-3 (2021).
5. Valdovinos, I. A., Pérez-Díaz, J. A., Choo, K.-K.R. & Botero, J. F. Emerging DDoS attack detection and mitigation strategies in
software-defined networks: Taxonomy, challenges and future directions. J. Netw. Comput. Appl. 187, 103093.
https://doi.org/10.1016/j.jnca.2021.103093 (2021).
6. Adedeji, K. B., Abu-Mahfouz, A. M. & Kurien, A. M. DDoS attack and detection methods in internet-enabled networks: Concept,
research perspectives, and challenges. J. Sens. Actuator Netw. 12, 51 (2023).
7. Saied, M., Guirguis, S. & Madbouly, M. A comparative analysis of using ensemble trees for botnet detection and classification in
IoT. Sci. Rep. 13, 21632. https://doi.org/10.1038/s41598-023-48681-6 (2023).
8. Saied, M., Guirguis, S. & Madbouly, M. Review of artificial intelligence for enhancing intrusion detection in the internet of things.
Eng. Appl. Artif. Intell. 127, 107231. https://doi.org/10.1016/j.engappai.2023.107231 (2024).
9. Ullah, S., Mahmood, Z., Ali, N., Ahmad, T. & Buriro, A. Machine learning-based dynamic attribute selection technique for DDoS
attack classification in IoT networks. Computers 12, 1156 (2023).
10. Chopra, A., Behal, S. & Sharma, V. Evaluating Machine Learning Algorithms to Detect and Classify DDoS Attacks in IoT. In 2021
8th International Conference on Computing for Sustainable Global Development (INDIACom) 517–521.
11. Aljanabi, Y. I., Majeed, A. A., Jihad, K. H. & Qader, B. A. Detect and mitigate blockchain-based DDoS attacks using machine
learning and smart contracts. Informatica https://doi.org/10.31449/inf.v46i7.4033 (2022).
12. Conti, M., Kumar, E. S., Lal, C. & Ruj, S. A survey on security and privacy issues of bitcoin. Commun. Surveys Tuts. 20, 3416–3452.
https://doi.org/10.1109/comst.2018.2842460 (2018).
13. Saied, M., Adjogble, F., Guirguis, S., Hemmji, M. & Warschat, J. A Framework for systematic scientific research management. In
2023 Portland International Conference on Management of Engineering and Technology (PICMET) 1–16.
14. Koay, A., Chen, A., Welch, I. & Seah, W. K. G. A new multi classifier system using entropy-based features in DDoS attack detection.
In 2018 International Conference on Information Networking (ICOIN) 162–167.
15. Tsobdjou, L. D., Pierre, S. & Quintero, A. An online entropy-based DDoS flooding attack detection system with dynamic threshold.
IEEE Trans. Netw. Service Manag. 19, 1679–1689. https://doi.org/10.1109/tnsm.2022.3142254 (2022).
16. Sahoo, K. S. et al. An early detection of low rate DDoS attack to SDN based data center networks using information distance
metrics. Future Gener. Comput. Syst. 89, 685–697. https://doi.org/10.1016/j.future.2018.07.017 (2018).
17. Pérez-Díaz, J. A., Valdovinos, I. A., Choo, K. K. R. & Zhu, D. A flexible SDN-based architecture for identifying and mitigating
low-rate DDoS attacks using machine learning. IEEE Access 8, 155859–155872. https://doi.org/10.1109/ACCESS.2020.3019330
(2020).
18. Ali, T. E., Chong, Y.-W. & Manickam, S. Machine learning techniques to detect a DDoS attack in SDN: A systematic review. Appl.
Sci. 13, 3183 (2023).
19. Li, D., Yu, C., Zhou, Q. & Yu, J. Using SVM to detect DDoS attack in SDN network. IOP Conf. Ser. Mater. Sci. Eng. 466, 012003.
https://doi.org/10.1088/1757-899X/466/1/012003 (2018).
20. Ye, J., Cheng, X., Zhu, J., Feng, L. & Song, L. A DDoS attack detection method based on SVM in software defined network. Secur.
Commun. Netw. 2018, 9804061. https://doi.org/10.1155/2018/9804061 (2018).
21. Cui, J., Wang, M., Luo, Y. & Zhong, H. DDoS detection and defense mechanism based on cognitive-inspired computing in SDN.
Future Gener. Comput. Syst. 97, 275–283. https://doi.org/10.1016/j.future.2019.02.037 (2019).
22. Hannache, O. & Batouche, M. C. Neural network-based approach for detection and mitigation of DDoS attacks in SDN environ-
ments. Int. J. Inf. Secur. Priv. (IJISP) 14, 50–71 (2020).
23. Cui, J., Zhang, J., He, J., Zhong, H. & Lu, Y. DDoS detection and defense mechanism for SDN controllers with K-Means. In 2020
IEEE/ACM 13th International Conference on Utility and Cloud Computing (UCC) 394–401.
24. Gu, Y., Li, K., Guo, Z. & Wang, Y. Semi-supervised K-means DDoS detection method using hybrid feature selection algorithm.
IEEE Access 7, 64351–64365 (2019).
25. Gadze, J. D., Bamfo-Asante, A. A., Agyemang, J. O., Nunoo-Mensah, H. & Opare, K.A.-B. An investigation into the application of
deep learning in the detection and mitigation of DDOS attack on SDN controllers. Technologies 9, 14 (2021).
26. Li, C. et al. Detection and defense of DDoS attack-based on deep learning in OpenFlow-based SDN. Int. J. Commun. Syst. 31,
e3497 (2018).
27. Makuvaza, A., Jat, D. S. & Gamundani, A. M. Deep neural network (DNN) solution for real-time detection of distributed denial
of service (DDoS) attacks in software defined networks (SDNs). SN Comput. Sci. https://doi.org/10.1007/s42979-021-00467-16
(2021).
28. Nugraha, B. & Murthy, R. N. Deep learning-based slow DDoS attack detection in SDN-based networks. In 2020 IEEE Conference
on Network Function Virtualization and Software Defined Networks (NFV-SDN). 51–56.
29. Doriguzzi-Corin, R., Millar, S., Scott-Hayward, S., Martinez-del-Rincon, J. & Siracusa, D. LUCID: A practical, lightweight deep
learning solution for DDoS attack detection. IEEE Trans. Netw. Service Manag. 17, 876–889 (2020).
30. Liang, X. & Znati, T. A long short-term memory enabled framework for DDoS detection. In 2019 IEEE Global Communications
Conference (GLOBECOM). 1–6.
31. Cil, A. E., Yildiz, K. & Buldu, A. Detection of DDoS attacks with feed forward based deep neural network model. Expert Syst. Appl.
169, 114520 (2021).
32. Khan, I. A. et al. A new explainable deep learning framework for cyber threat discovery in industrial IoT networks. IEEE Int. Things
J. 9, 11604–11613. https://doi.org/10.1109/JIOT.2021.3130156 (2022).
33. Khan, I. A. et al. Enhancing IIoT networks protection: A robust security model for attack detection in internet industrial control
systems. Ad Hoc Netw. 134, 102930. https://doi.org/10.1016/j.adhoc.2022.102930 (2022).
34. Khan, I. A. et al. A privacy-conserving framework based intrusion detection method for detecting and recognizing malicious
behaviours in cyber-physical power networks. Appl. Intell. 51, 7306–7321. https://doi.org/10.1007/s10489-021-02222-8 (2021).
35. Khan, I. A. et al. Federated-SRUs: A federated-simple-recurrent-units-based IDS for accurate detection of cyber attacks against
IoT-augmented industrial control systems. IEEE Int. Things J. 10, 8467–8476. https://doi.org/10.1109/JIOT.2022.3200048 (2023).
36. Wani, S. et al. Distributed denial of service (DDoS) mitigation using blockchain—A comprehensive insight. Symmetry 13, 227
(2021).
37. Bose, A., Aujla, G. S., Singh, M., Kumar, N. & Cao, H. Blockchain as a service for software defined networks: a denial of service
attack perspective. In 2019 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence
and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/
CBDCom/CyberSciTech) 901–906.
38. Boussard, M., Papillon, S., Peloso, P., Signorini, M. & Waisbard, E. STewARD:SDN and blockchain-based trust evaluation for
automated risk management on IoT devices. In IEEE INFOCOM 2019 - IEEE Conference on Computer Communications Workshops
(INFOCOM WKSHPS) 841–846.
39. Chattaraj, D., Saha, S., Bera, B. & Das, A. K. On the design of blockchain-based access control scheme for software defined networks.
In IEEE INFOCOM 2020-IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS) 237–242 (IEEE).
40. Adedeji, K. B., Abu-Mahfouz, A. M. & Kurien, A. M. DDoS attack and detection methods in internet-enabled networks: Concept,
research perspectives, and challenges. J. Sens. Actuator Netw. 12, 51 (2023).
41. Sharafaldin, I., Habibi Lashkari, A. & Ghorbani, A. A. Toward generating a new intrusion detection dataset and intrusion traffic
characterization. In International Conference on Information Systems Security and Privacy.
42. University of New Brunswick, C. I. f. C. CSE-CIC-IDS2018 on AWS, https://www.unb.ca/cic/datasets/ids-2018.html (2021).
43. Sharafaldin, I., Habibi Lashkari, A., Hakak, S. & Ghorbani, A. A. Developing realistic distributed denial of service (DDoS) attack
dataset and taxonomy. In 2019 International Carnahan Conference on Security Technology (ICCST), 1–8 (2019).
44. Prasad, A. & Chandra, S. VMFCVD: An optimized framework to combat volumetric DDoS attacks using machine learning. Arab.
J. Sci. Eng. 47, 9965–9983. https://doi.org/10.1007/s13369-021-06484-9 (2022).
45. Tewari, S. & Dwivedi, U. D. A real-world investigation of TwinSVM for the classification of petroleum drilling data. In 2019 IEEE
Region 10 Symposium (TENSYMP). 90–95.
46. Roopak, M., Tian, G.-Y. & Chambers, J. A. Deep learning models for cyber security in IoT networks. In 2019 IEEE 9th Annual
Computing and Communication Workshop and Conference (CCWC), 0452–0457 (2019).
47. Alghazzawi, D., Bamasag, O., Ullah, H. & Asghar, M. Z. Efficient detection of DDoS attacks using a hybrid deep learning model
with improved feature selection. Appl. Sci. 11, 11634 (2021).
Acknowledgements
The authors would like to thank Mohamed Saied Essa for his constructive comments and proofreading that
greatly improved this manuscript.
Author contributions
The authors contributed as follows: A.I. is the corresponding author. A.I., S.K. and E.A. performed the concep-
tualization. A.I., S.K. and E.A. determined the research methodology. A.I. and S.K. developed the Software. A.I.
and E.A. validated the results. S.K. and E.A. surveyed the related work. A.I. and E.A. collected the data. A.I. and
E.A. analyzed and interpreted the data. A.I. and S.K. prepared and wrote the original draft. A.I. and E.A. wrote
the reviews and edits. A.I. and E.A. designed the visualization. S.K. and E.A. supervised the whole research. All
authors have read and agreed to the published version of the manuscript.
Funding
Open access funding is provided by The Science, Technology & Innovation Funding Authority (STDF) in coop-
eration with The Egyptian Knowledge Bank (EKB). Funds or other support was received.
Competing interests
The authors declare no competing interests.
Additional information
Correspondence and requests for materials should be addressed to A.I.H.
Reprints and permissions information is available at www.nature.com/reprints.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which
permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give
appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and
indicate if changes were made. The images or other third party material in this article are included in the article’s
Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included
in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or
exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy
of this licence, visit http://creativecommons.org/licenses/by/4.0/.