Hybrid Intrusion Detection Systems Based Mean-Variance Mapping Optimization Algorithm and Random Search
Adil L. Albukhnefis1*, Amar A. Sakran2, AtaAllah Saleh Mahe3, Maryam Imran Mousa4, Ahmed Mohsin Mahdi1, Aqeel Hamza Al-Fatlawi5
1 College of Computer Science and Information Technology, University of Al-Qadisiyah, Iraq
2 College of Biomedical Informatics, University of Information Technology and Communication, Iraq
3 Ministry of Industry and Minerals, Iraq
4 Ministry of Education, Al-Qadisiyah Education Directorate, Iraq
5 Department of Computer Techniques, Imam Kadhum College, Iraq
* Corresponding author's Email: adil.lateef@qu.edu.iq
Abstract: Intrusion detection systems are critical in identifying and mitigating cyber threats. However, intrusion data often contains insufficient features, which can adversely affect the classification accuracy of machine learning algorithms. Selecting optimal features from intrusion data requires an efficient model that can extract highly correlated features, and traditional detection systems may suffer from low accuracy and high false-positive rates. This paper proposes a hybrid model, MVMOR, that improves intrusion detection systems by combining mean-variance mapping optimization (MVMO) with random search. In the proposed model, the MVMO algorithm searches for the optimal feature subset, while random search optimizes the hyperparameters of the machine learning algorithm (classifier). The hybrid model thus optimizes feature selection and classifier parameters at the same time. Its performance is evaluated on the NSL-KDD benchmark dataset. The proposed MVMOR achieves an accuracy of 88%, while the conventional MVMO achieves only 80%. The empirical results show that the proposed model has the potential to offer better protection against various cyber threats, making a valuable contribution to the field of cybersecurity.
Keywords: Intrusion detection systems, Optimize machine learning algorithms, Feature selection, Mean-variance
mapping optimization (MVMO), Cybersecurity.
Figure. 1 The scenario of applying the IDS in networks

mapping function for the mutation operation based on two critical statistical functions: the mean and the variance of the set of best solutions. Developed initially for meta-heuristic searches in the continuous domain, MVMO is limited to a discrete interval of [0, 1] when selecting features via binary search. It is essential to maintain high diversity to avoid stagnation or sliding into local optima, as noted in [16, 17]. The hybrid algorithm has the potential to support frequent jumps toward the global solution, improving the overall progress of the search. The intrusion detection system (IDS) analyses the frequency and nature of attacks, and organizations can use this data to improve their security measures or implement more efficient controls [18]. In addition, an intrusion detection system can help organizations identify network device configuration issues or errors. Fig. 1 shows the scenario of applying the IDS in networks.

Integrating metaheuristic algorithms and wrapper models into intrusion detection systems provides an intelligent and adaptive approach to cyber threat detection and mitigation. It provides improved accuracy, adaptability, scalability, and automation, overcoming the limitations of traditional rule-based methods and improving the overall security of networks and systems. This paper proposes a hybrid model, MVMOR, that combines mean-variance mapping optimization (MVMO) and random search to enhance intrusion detection systems. The MVMO algorithm is used in the proposed model's search strategy to find the ideal feature subset, and random search is used to optimize the hyperparameters of the machine learning algorithm (classifier). The model uses meta-heuristic optimization algorithms to optimize the proposed solutions and identify the most important features for network attribute analysis.

• Contributions
The contributions highlight this research's innovative techniques and methodologies, which lead to significant progress in intrusion detection systems. The proposed model makes several noteworthy contributions:
1. It introduces MVMOR, a novel hybrid model combining mean-variance mapping optimization (MVMO) with random search. This hybrid approach aims to improve the performance of intrusion detection systems.
2. It uses metaheuristic optimization algorithms to refine the proposed solutions and effectively identify the critical features for network attribute analysis. This process ensures comprehensive and accurate network attribute evaluation.
3. It achieves better performance and overall system optimization by optimizing feature selection and classifier parameters simultaneously. By considering these two aspects together, the method ensures comprehensive optimization and improved performance.

• Paper organization
The remainder of this paper is organized as follows. Section 2 presents related work on feature selection for intrusion detection. Sections 3 and 4 introduce the principles of MVMO and the wrapper model, respectively. The proposed model and its components are discussed in Section 5. The experimental results and analysis are discussed in Section 7. Finally, the conclusions of this work and future improvements are presented in Section 8.

2. Related works
Numerous studies have suggested optimizing feature selection (F.S.). However, the majority of these studies have utilized standalone metaheuristic algorithms, such as particle swarm optimization (PSO), gray wolf optimizer (GWO), and bat search algorithm (B.A.), without improving the search operations within these algorithms. Hybrid models are beneficial when seeking to optimize both the feature selection and the parameters of classifiers to improve the accuracy of intrusion detection systems.
In their study, Almasoudy et al. [18] introduced a
International Journal of Intelligent Engineering and Systems, Vol.16, No.5, 2023 DOI: 10.22266/ijies2023.1031.47
Received: May 23, 2023. Revised: August 1, 2023.
hybrid model based on the bat algorithm (B.A.) for optimizing support vector machine (SVM) parameters and selecting optimal features. The model utilized a pool of bat vectors, where the first two positions were assigned to SVM parameters, and the rest of the vector represented the feature selection mask. The proposed model's reliance on parameter adjustment based on gradients introduces limitations in terms of the search space for the algorithm. Additionally, using a wrapper method for feature selection does not guarantee a reduction in the number of selected features. Consequently, the performance of the SVM algorithm may suffer due to its inefficiency in handling high-dimensional problems.
In another study by the author [12], two metaheuristic algorithms, namely particle swarm optimization (PSO) and bat search algorithm (BSA), were proposed to search for the best solutions individually, which were then shared between the two algorithms. However, the proposed model suffered from stagnation and failed to reduce the local optima problem.
In [19] the authors proposed a method in the field of feature selection. It combines the Naive Bayes classifier and the bat algorithm to select a subset of features that contribute most to classification accuracy. However, a potential limitation of this approach is the assumption of feature independence in Naive Bayes, which may not hold in real-world scenarios where features are correlated or have complex relationships. This limitation could affect the model's ability to accurately capture feature dependencies, potentially leading to suboptimal feature subsets and lower performance in feature selection tasks.
Taha et al. [20] utilized Naïve Bayes (N.B.) to assist B.A. in selecting optimal subgroup features. They proposed decreasing the bat's velocity when the difference between the past and current position is negative. However, this system failed to improve the behaviour-based search progress, and the variety of proposed solutions still left much to be desired when exploring the search process.
R. Nuiaa et al. [21] proposed the proactive feature selection (PFS) model for detecting cyber-attacks based on subset feature selection. They introduced a nature-inspired optimization algorithm and a proactive feature selection threshold to optimize the feature selection technique. However, the proposed model was only applied to optimize feature selection and not to improve the overall search process.

3. Mean-variance mapping optimization (MVMO)
MVMO is a population-based stochastic optimization algorithm and a novel optimization approach [8, 14, 15]. Like other stochastic optimization methods, MVMO employs evolutionary operations such as selection, crossover, and mutation [20]. However, what sets MVMO apart is that it constrains the search space of all optimization variables and the output of internal operations to the interval [0,1]. Additionally, MVMO applies a mapping function as a mutation operation on the offspring generated through the crossover. This mapping function is calculated based on the mean and variance of the n-best solutions and is used to optimize the offspring further. Eqs. (1) to (5) give the mathematical formulas used to find the new offspring.

$$x_j' = \frac{1}{n}\sum_{i=1}^{n} x_i \qquad (1)$$

Where: $x_j'$ is the mean value of the $j$-th offspring, $n$ is the dimension of the offspring, and $j$ is the sequence number.

$$V_j = \frac{1}{n}\sum_{i=1}^{n} (x_i - x_j')^2 \qquad (2)$$

Where: $V_j$ is the variance. The new population is generated by applying the H-function.

$$X_j = h_x + (1 - h_1 + h_0)\, x_j - h_0 \qquad (3)$$

Where: $X_j$ is the offspring and $h$ is the H-function, defined as follows.

$$h(x) = x_j' \left(1 - e^{-x s_1}\right) + (1 - x_j')\, e^{-(1-x) s_2} \qquad (4)$$

Where: $h_x = h(x = x_j)$, $h_0 = h(x = 0)$, $h_1 = h(x = 1)$.

The shape variables $s_1$ and $s_2$ depend on the value of $s_i$ ($i$ = 1 or 2), which is calculated by Eq. (5).

$$s_i = -\ln(V_j)\, f_s \qquad (5)$$

Where: $f_s$ is a factor that controls the shape variables.

4. Wrapper model
Wrapper model feature selection is a widespread technique in the field of machine learning used to select the most relevant features for a given task [21].
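As a rough illustration of the wrapper idea, the sketch below scores feature subsets by classifying held-out samples with only the selected columns, and prefers the smaller subset when accuracies tie. It is not the paper's implementation: the exhaustive subset enumeration stands in for the MVMO search, and the 1-nearest-neighbour classifier, the toy data, and the names `one_nn_accuracy` and `wrapper_select` are all assumptions made for illustration.

```python
from itertools import combinations


def one_nn_accuracy(train, test, columns):
    """Accuracy of a 1-nearest-neighbour classifier that sees only `columns`."""
    def dist(a, b):
        return sum((a[c] - b[c]) ** 2 for c in columns)

    correct = 0
    for features, label in test:
        nearest = min(train, key=lambda row: dist(row[0], features))
        correct += nearest[1] == label
    return correct / len(test)


def wrapper_select(train, test, n_features):
    """Score every non-empty subset; prefer higher accuracy, then fewer features."""
    best_subset, best_key = None, None
    for size in range(1, n_features + 1):
        for subset in combinations(range(n_features), size):
            acc = one_nn_accuracy(train, test, subset)
            key = (acc, -size)
            if best_key is None or key > best_key:
                best_subset, best_key = subset, key
    return best_subset, best_key[0]


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
train = [((0.1, 0.9), 0), ((0.2, 0.1), 0), ((0.8, 0.8), 1), ((0.9, 0.2), 1)]
test = [((0.15, 0.2), 0), ((0.85, 0.9), 1), ((0.3, 0.95), 0), ((0.7, 0.1), 1)]

subset, accuracy = wrapper_select(train, test, n_features=2)
print(subset, accuracy)
```

Exhaustive enumeration is only feasible for a handful of features, which is why a metaheuristic search is used to propose candidate subsets in practice.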
The method involves training a model multiple times using various subsets of the available features and evaluating each model's performance to identify the most crucial features. This approach aims to maximize the model's performance while minimizing the number of features used [22].
The process of feature selection is viewed as an optimization problem. The wrapper model utilizes a particular learning algorithm, such as a decision tree or a neural network, as the wrapper to assess the model's performance on a specific task and determine which features are most significant. The wrapper model is a feature selection method with several advantages, including its ability to handle complex feature interactions and non-linear relationships. It provides a more accurate and reliable feature selection process than other methods, such as filters or embedded feature selection techniques. The method involves two primary phases: feature subset generation and model evaluation [23-25].
During the feature subset generation stage, a subset of the available features is selected for each model iteration, either randomly or using techniques such as forward or backward selection or genetic algorithms. The wrapper model approach is then used to train the model on the subset of features that performs the best after being evaluated using a specific performance metric, such as accuracy.
The main drawback of the wrapper model is the computational cost of training multiple models with various subsets of features, which can be especially problematic for large datasets or complex models that require a lot of computational power [26]. Different methods have been proposed to address this issue, such as training the models using a subset of the data or using model ensembles to reduce variance in the feature selection process. Despite its drawbacks, the wrapper model remains a widely used and effective technique for feature selection in machine learning. Fig. 2 shows the principles of the wrapper model.

5. Proposed MVMOR Model
The system under consideration comprises three fundamental phases: initialization of the machine learning parameters and data preparation, feature selection, and tuning of the machine learning (ML) algorithm parameters. Fig. 3 illustrates the main steps and strategy of the proposed model (MVMOR).

5.1 Initialize the algorithm parameters and prepare the dataset
The proposed approach uses one-hot coding to convert non-numeric (nominal) attributes to numeric values so that they can be analyzed. The pre-processing phase is crucial in normalizing the dataset so that values are scaled within a defined range. This ensures that bias is removed from the data set while statistical properties are preserved. The data are divided into training and testing sets. In this way, the model can be trained on the training dataset, while the test dataset serves as a means to validate the model's effectiveness. It is worth noting that one-hot coding can be a very effective means of converting nominal attributes into numerical attributes. The data
can be more easily analyzed and modeled, giving researchers valuable insights into network traffic's underlying patterns and behaviors. The proposed approach represents a powerful tool for conducting sophisticated analyses of network traffic attributes, which can help researchers better understand and address cybersecurity threats in real-world environments.

5.2 Feature selection
The wrapper model uses binary metaheuristic algorithms that restrict the search space to the binary interval [0,1]. However, the MVMO algorithm improves the exploration process by extending the search interval to [-1,1] to check the corresponding subset features. The user sets the feature selection threshold for each experiment, e.g., a threshold of 0.5, in which case features with values less than 0.5 are omitted. In contrast, features in the current generation with values greater than or equal to the threshold are kept. The search process is terminated when the maximum iteration limit is reached [23].

5.3 Tuning parameters of machine learning:
The proposed method currently uses a random search strategy to assign values to the parameters of the machine learning algorithm. Adding or removing a theta value changes the parameters, and the search process is repeated until the algorithm achieves better results. While this approach can lead to better results, it also leads to a longer, more time-consuming induction process. The proposed method has similarities with the hill-climbing system in terms of the underlying concept. Both methods aim to find the optimal solution by iteratively improving the current solution. However, unlike hill-climbing, which takes a deterministic approach, the proposed method adopts a stochastic process in which the parameters are randomly changed to explore the solution space.
While random search is a popular approach for optimizing machine learning algorithms, it has certain limitations. For example, the method may lead to suboptimal solutions if the parameter space is large and the search process becomes impractical. In such cases, alternative optimization techniques like gradient descent or Bayesian optimization may provide better results.
The similarity with the hill-climbing algorithm suggests that iterative improvement is an effective optimization technique. Eq. (6) gives the main formula of the random search process.

$$X_i' = \frac{\partial}{\varepsilon^{2}(X_i)} + (X_{best} - X_i) + \frac{Max_{itr}}{iter} \qquad (6)$$

Where $X_i$ is the current value, $\varepsilon$ are random variables in the interval [-1,1], $Max_{itr}$ is the maximum number of iterations, $iter$ is the current iteration, and $X_i'$ is the new value.
The range of each parameter is set in advance to bound the search space.

Table 1. Parameters of random forest algorithm
Parameter           Range                                       Type
n_estimators        [10, 1000]                                  integer
max_depth           [2, 20]                                     integer
min_samples_leaf    [1, 10]                                     integer
max_features        [1, number of features in dataset]          integer
random_state        [1,0.42]                                    float

Table 2. Parameters of K-NN algorithm
Parameter           Range                                       Type
Nearest neighbors   [1, (Number of classes in dataset + 1)]     float

7. Experiment results
This section describes the benchmark dataset and evaluates the performance of four commonly used machine learning algorithms: decision tree (D.T.), random forest (R.F.), naive Bayes (N.B.), and k-nearest neighbors (K-NN). Finally, we compare the results obtained from our model with those of recent studies in IDS.

7.1 Dataset
workflow, engineers can ensure their algorithms are optimized for maximum performance and efficiency. Fig. 6 shows the evaluation in terms of the low standard deviation of MVMO and MVMOR.

Figure. 6 The standard deviation for the best result obtained after running MVMO and MVMOR thirty times

7.3 Comparative with other studies:
This section compares our proposed system with previous studies [7, 19, 20] conducted on the same dataset. The algorithm we compare, MVMOR, is presented in Table 6. Additionally, Table 7 compares our proposed MVMOR method with the other works on IDS.

Table 8. Comparative with other studies
Ref               Dataset name     Accuracy (%)
[7]               KDD_NSL Test+    82
[19]              KDD_NSL Test+    81
[20]              KDD_NSL Test+    82
Proposed MVMOR    KDD_NSL Test+    88

8. Conclusion
The research paper proposes a hybrid model, MVMOR, combining mean-variance mapping optimization (MVMO) and random search to improve intrusion detection systems. This model uses meta-heuristic optimization algorithms to optimize the proposed solutions and identify the most relevant features for network attribute analysis. The wrapper model is used as the feature selection method to select relevant features, and one-hot coding is applied to convert non-numeric attributes into numeric ones. The proposed method uses a random search strategy to optimize the parameters of the machine learning algorithm, which leads to better results but requires a longer induction process. The random forest algorithm gave the best results when used with the proposed system, indicating that it is

Conflicts of interest
The authors declare no conflict of interest.

Author contributions
The authors' contributions are as follows: conceptualization, first and second authors; methodology and software, first author; validation, second author; formal analysis and investigation, third author; resources, fourth author; data curation, third author; writing (original draft preparation), first author; writing (review and editing) and visualization, fifth author; supervision and project administration, first author; funding acquisition, fifth author.

Reference
[1] A. R. Gad, A. A. Nashat, and T. M. Barkat, "Intrusion Detection System Using Machine Learning for Vehicular Ad Hoc Networks Based on ToN-IoT Dataset", IEEE Access, Vol. 9, pp. 142206–142217, 2021, doi: 10.1109/ACCESS.2021.3120626.
[2] E. Pashaei, E. Pashaei, and N. Aydin, "Gene selection using hybrid binary black hole algorithm and modified binary particle swarm optimization", Genomics, Vol. 111, No. 4, pp. 669–686, 2019, doi: 10.1016/j.ygeno.2018.04.004.
[3] M. Z. Zakaria, S. Mutalib, S. A. Rahman, S. J. Elias, and A. Z. Shahuddin, "Solving RFID mobile reader path problem with optimization algorithms", Indones. J. Electr. Eng. Comput. Sci., Vol. 13, No. 3, pp. 1110–1116, 2019, doi: 10.11591/ijeecs.v13.i3.pp1110-1116.
[4] Y. Yang, Y. Wu, H. Yuan, M. Khishe, and M. Mohammadi, "Nodes clustering and multi-hop routing protocol optimization using hybrid chimp optimization and hunger games search algorithms for sustainable energy efficient underwater wireless sensor networks", Sustain. Comput. Informatics Syst., Vol. 35, p. 100731, 2022.
[5] A. Bhattacharyya, R. Chakraborty, S. Saha, S. Sen, R. Sarkar, and K. Roy, "A Two-Stage Deep