K Nearest Neighbor Based Model For Intrusion Detection System

This document presents a K nearest neighbor (KNN) based model for intrusion detection. The model is evaluated on the ISCX dataset. KNN is a simple machine learning algorithm that can be used for both classification and regression problems. It works by finding the k closest training examples in the feature space and assigning the test instance the most common class among its k nearest neighbors. The proposed KNN model achieved an improved accuracy of 99.96% for intrusion detection, outperforming other models. Experimental results demonstrate the effectiveness of the KNN approach for this application.

International Journal of Recent Technology and Engineering (IJRTE)

ISSN: 2277-3878 (Online), Volume-8 Issue-2, July 2019

K Nearest Neighbor Based Model for Intrusion Detection System
M.Nikhitha, M.A.Jabbar

Revised Manuscript Received on 30 July 2019.
* Correspondence Author
M.Nikhitha*, Computer Science and Engineering, Vardhaman College of Engineering, Hyderabad, India.
Dr.M.A.Jabbar, Computer Science and Engineering, Vardhaman College of Engineering, Hyderabad, India.

© The Authors. Published by Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license http://creativecommons.org/licenses/by-nc-nd/4.0/

Retrieval Number B2458078219/19©BEIESP
DOI: 10.3940/ijrte.B2458.078219
Journal Website: www.ijrte.org
Published By: Blue Eyes Intelligence Engineering & Sciences Publication

Abstract: Network security has become more important in this digital era due to the widespread use of information and communications technology (ICT). Data security is also one of the major issues in today's world. As the use of ICT grows, threats to networks are also increasing. To address these problems, researchers have developed intrusion detection systems (IDS) that analyze network traffic to identify harmful users and hackers. In this paper, we design a model for IDS that classifies attacks using the K-Nearest Neighbor (KNN) classifier. KNN is a supervised, lazy machine learning classifier that performs well in terms of classification accuracy. Experimental analysis was conducted on the ISCX dataset to evaluate the model. The experimental outcome shows that our suggested model recorded an improved accuracy of 99.96%.

Index Terms: Network Security, Intrusion detection system, Data Security, k nearest neighbor, Machine learning

I. INTRODUCTION

Internet usage has been increasing steadily with the rapid growth of networks and technology. This fast growth has given rise to new threats and vulnerabilities in networks. Intruders are attackers or malicious users who design new ways to break into networks. Traditionally, approaches such as firewalls, encryption, authentication and VPNs have been used to secure the network infrastructure from intruders [9]. An Intrusion Detection System (IDS) builds on these technologies; it is mainly used to identify attacks in the network and to warn the system when an intruder has gained access [3].
KNN is among the simplest algorithms in machine learning. KNN is a lazy learner, also known as an instance-based learner [2]. It is non-parametric: it makes no assumptions about the structure of the data. KNN is widely used for classification problems, although it can be used for both classification and regression. In this work, the KNN classifier achieved the highest accuracy among the classifiers compared.
In this article we propose an IDS that uses the k nearest neighbor classifier to improve accuracy when classifying different attack types. Section 2 presents the literature review and Section 3 describes related work. Our proposed work is explained in Section 4 and the experimental results are analyzed in Section 5. Finally, Section 6 concludes.

II. LITERATURE REVIEW

A. IDS (Intrusion detection system)
An intrusion is a type of attack or interference that occurs within a system. An IDS is a software application that observes and analyzes the traffic within a network and protects the network from intruders. The primary objective of an IDS is to detect intrusions and identify various types of attacks.
Attack Types:
An IDS plays a crucial role in detecting attacks. The attacks it detects are categorized as DOS, Probe, R2L and U2R [4][5].
1. DOS attack: In this attack, the attacker prevents authorized users from accessing the network or makes services unavailable to them. Ex: Smurf, Teardrop, Neptune.
2. Probe attack: In this type of attack, the attacker gathers all the required information about the target system before initiating the attack. Ex: Satan, Ipsweep and Nmap.
3. User to Root (U2R) attack: The attacker starts as a normal account user and then gradually exploits vulnerabilities to obtain illegal root access to the computer. Ex: Perl, Eject and Loadmodule.
4. Remote to Local (R2L) attack: In this attack, the intruder sends packets to the target machine remotely to expose vulnerabilities and obtain local access to the target machine. Ex: Multihop, Sendmail and Imap.

B. KNN (K-Nearest Neighbor)
K-Nearest Neighbor is a data mining classifier. KNN is a supervised classifier, proposed by Fix and Hodges in 1951 [7]. The output of the target variable is predicted by finding the k closest neighbors, where closeness is measured by the Euclidean distance. It is a non-parametric classification technique which does not make any assumptions about the underlying data [6]. The advantages of KNN [8] are:
i. Easy to implement and understand.
ii. It is effective when the training data is large.
iii. It is robust to noisy data.
iv. It constantly evolves and easily adapts to new environments.
v. Easy to implement for multi-class problems.

Fig 1: KNN Classification of Data Instances [19]
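The Euclidean-distance neighbor vote described in Section II-B can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used in the paper: the function name knn_predict, the toy two-feature records and their labels are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples."""
    # Euclidean distance from the query to every training sample.
    distances = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Sort ascending by distance and keep the first k entries.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority class among the k neighbours (ties go to the label seen first).
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy two-feature records; the values and labels are made up for illustration.
train_X = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25), (0.9, 0.8), (0.8, 0.9), (0.85, 0.95)]
train_y = ["normal", "normal", "normal", "attack", "attack", "attack"]

print(knn_predict(train_X, train_y, (0.2, 0.2), k=3))
print(knn_predict(train_X, train_y, (0.9, 0.9), k=3))
```

Because every prediction scans the whole training set, the cost of classifying one sample grows linearly with the number of training records; this is the price of KNN being a lazy learner with no training phase.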

C. KNN Algorithm

Step 1: Load the train and test dataset.
Step 2: Choose the value of k, the number of neighbors.
Step 3: For each data sample in the test data
- Calculate the distance between the selected sample and each training sample.
- Store the distances and sort them in ascending order.
- List out the first k entries.
- Assign a class to the new sample based on the majority class among these k neighbor points.
Step 4: Record the accuracy.

III. RELATED WORK

In 2018 L. Haripriya and M. A. Jabbar proposed a novel IDS using ANN and feature subset selection [5]. The authors adapted the back propagation algorithm to classify attacks, together with feature subset selection, on the KYOTO dataset. The experimental results showed an increase in accuracy, precision, recall, and F-measure. Their method recorded an accuracy of 98.66%.
In 2016 Nabila Farnaaz and M. A. Jabbar proposed an IDS [4] using a Random Forest classifier. A feature selection technique was used to reduce data dimensionality. The experiments were carried out on the NSL-KDD dataset, and the model achieved high classification accuracy and detection rate (DR). Their proposed random forest model recorded an accuracy of 99.67%.
Md Al Mehedi Hasan et al. [16] proposed an IDS using RF and SVM. The authors built two models based on these classifiers and compared them on accuracy, false positive rate, F-value and detection rate.
Amreen Sultana and M. A. Jabbar proposed an intelligent NIDS using data mining techniques. The authors used the Average One Dependence Estimator (AODE) classification algorithm to detect the type of attack [17]. The experimental analysis was done on the NSL-KDD dataset. The outcome shows an accuracy of 97% and a detection rate of 98%.
In 2017, a novel ensemble IDS was developed [18]. The authors used a composite classifier (RFAODE) for IDS. The classifier is a combination of the RF (Random Forest) and Average One-Dependence Estimator (AODE) algorithms. The evaluation was done on the KYOTO dataset. The proposed model reached an accuracy of 90.51% with a false alarm rate (FAR) of 0.14.

IV. PROPOSED WORK

This section discusses the proposed KNN based model for IDS.

Algorithm: Intrusion Detection System using KNN.
Input: ISCX dataset.
Output: Attack Classification.
Step 1: Load the ISCX dataset.
Step 2: Apply cross validation (5-fold and 10-fold).
Step 3: Build the model using KNN.
Step 4: Give the test data to KNN for classification.
Step 5: Calculate Accuracy, Precision, Recall and F-measure.

For evaluation we used the ISCX dataset, which is in CSV format. The proposed model was implemented in Python and uses the Euclidean distance, evaluated for various values of K. The proposed model also applied the cross validation technique for classification, with 5 and 10 folds. Features of the ISCX dataset are shown in Table 1:

Table 1: Specifications of ISCX data set.

Number Of Attributes    78
Number Of Instances     65536
Missing Values          NO
Number Of Classes       2

CROSS VALIDATION: Data scientists use validation as one of the important statistical techniques to estimate the stability or skill of a model, that is, how it will react to new data [10]. Cross validation is a re-sampling method for evaluating models on a limited data sample [11]. It is popular because it is easy to understand and its results are less optimistic or biased [10]. It helps guard against problems such as overfitting and underfitting.
One round of cross validation involves partitioning the data into complementary subsets, performing the analysis on one subset and validating the analysis on the other [12]. If the data is split into n equal-sized subparts, one subpart is used for testing and the remaining n-1 subparts are used for training the model. The procedure is repeated n times, and the n results are averaged to obtain a single estimate.

V. EXPERIMENTAL RESULT

The performance of the classifier is calculated from the error matrix, from which we can derive the values of accuracy, precision, recall, F-measure, etc.
CONFUSION MATRIX: The confusion matrix, also called the error matrix, describes the performance of a classifier or model on test data. It also allows the performance of an algorithm to be visualized [13][15]. The basic confusion matrix is shown in Table 2.

Table 2: Basic Confusion Matrix

                    Predicted Positive     Predicted Negative
Actual Positive     True Positive (TP)     False Negative (FN)
Actual Negative     False Positive (FP)    True Negative (TN)

ACCURACY: Accuracy is evaluated as
Accuracy = (TP + TN) / (TP + TN + FP + FN)
PRECISION: It is defined from the confusion matrix as
Precision = TP / (TP + FP)
RECALL: Recall is defined as
Recall = TP / (TP + FN)
F-MEASURE: F-measure is defined as
F = 2 * Recall * Precision / (Recall + Precision)
The KNN algorithm is applied on the dataset. Cross validation with 5 and 10 folds is applied for classification.

The value of K ranges from 2 to 10 (K = 2, 3, 4, 5, 6, 7, 8, 9 and 10). Fig 2 shows the accuracy on the full training set and Fig 3 shows the classification accuracy obtained with 5-fold and 10-fold cross validation. The results of our model are tabulated in Table 3.
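The n-fold procedure used here (one fold held out for testing, the remaining folds used for training, scores averaged) can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names k_fold_indices, cross_validate and nearest_neighbour are hypothetical, the data is synthetic, and a 1-nearest-neighbor rule stands in for the classifier.

```python
import math
import random


def k_fold_indices(n_samples, n_folds, seed=0):
    """Shuffle sample indices and deal them into n_folds nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def cross_validate(X, y, n_folds, predict):
    """Average test accuracy over n_folds rounds; assumes n_folds <= len(X)."""
    scores = []
    for test_idx in k_fold_indices(len(X), n_folds):
        held_out = set(test_idx)
        # Every sample outside the held-out fold is used for training.
        train_idx = [i for i in range(len(X)) if i not in held_out]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(predict(train_X, train_y, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


def nearest_neighbour(train_X, train_y, query):
    """1-nearest-neighbour stand-in classifier used only for this demo."""
    best = min(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    return train_y[best]


# Synthetic, well-separated data: 20 "normal" points near 0 and 20 "attack" points near 10.
X = [(float(i % 5), float(i // 5)) for i in range(20)]
X += [(10.0 + i % 5, 10.0 + i // 5) for i in range(20)]
y = ["normal"] * 20 + ["attack"] * 20

for n_folds in (5, 10):
    print(n_folds, cross_validate(X, y, n_folds, nearest_neighbour))
```

Shuffling before dealing the folds matters when the records are ordered by class, as they are in this synthetic example; otherwise a fold could contain a single class only.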
For training, the dataset is split as follows:

i. Training=80% and Testing=20%: accuracy=99.96%.
ii. Training=70% and Testing=30%: accuracy=99.96%.
iii. Training=60% and Testing=40%: accuracy=99.96%.
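A hold-out split of this kind can be sketched as below. The helper name holdout_split and the placeholder records are assumptions for illustration; the paper does not state how its splits were drawn (for example, whether they were shuffled or stratified).

```python
import random


def holdout_split(records, test_fraction, seed=0):
    """Shuffle the records and split them into (train, test) by the given fraction."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder rows standing in for dataset records.
records = list(range(1000))
for test_fraction in (0.2, 0.3, 0.4):
    train, test = holdout_split(records, test_fraction)
    print(f"test={test_fraction:.0%}: {len(train)} train, {len(test)} test")
```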

Fig 2: Classifying accuracy for full training set.

Fig 3: Accuracy classification for 5 & 10 cross validation technique.

Here, we compared our proposed method with SVM and Decision Tree classifiers, which obtained accuracies of 99.66% and 99.94% respectively. The comparison of our approach with these models is shown in Fig 4 and Table 4:

Table 4: Differentiating proposed model with others

S NO    Approach         Accuracy
1       SVM              99.66%
2       Decision Tree    99.94%
3       KNN              99.96%

Fig 4: Comparison of our approach with different models.

VI. CONCLUSION
In this paper, we applied the k nearest neighbor algorithm to an IDS data set to classify attack types such as DoS, Probe, U2R and R2L. The proposed method is validated with 5-fold and 10-fold cross validation. The experimental analysis shows that, compared to other classification methods, the proposed model improves the accuracy, precision, recall and F-measure values. In future, we plan to apply optimization techniques to the classification of IDS datasets.

REFERENCES:
1. K. Kanaka Vardhini et al., "Enhanced Intrusion Detection System using Data Reduction: An Ant Colony Optimization Approach", International Journal of Applied Engineering Research, ISSN 0973-4562, Volume 12, Number 9 (2017), pp. 1844-1847.
2. http://www.scholarpedia.org/article/K-nearest_neighbor
3. https://searchsecurity.techtarget.com/definition/intrusion-detection-system
4. Nabila Farnaaz and M. A. Jabbar, "Random Forest Modeling for Network Intrusion Detection System", Elsevier, ScienceDirect, 2016.
5. L. Haripriya and M. A. Jabbar, "A Novel Intrusion Detection System using ANN and Feature Subset Selection", International Journal of Engineering and Technology, 2018.
6. https://medium.com/datadriveninvestor/knn-algorithm-and-implementation-from-scratch-b9f9b739c28f
7. http://www.scholarpedia.org/article/K-nearest_neighbor
8. https://www.fromthegenesis.com/pros-and-cons-of-k-nearest-neighbors/
9. Mohit Tiwari, Raj Kumar, et al., "Intrusion Detection System", International Journal of Technical Research and Applications, April 2017.
10. https://towardsdatascience.com/cross-validation-70289113a072
11. https://machinelearningmastery.com/k-fold-cross-validation/
12. https://en.m.wikipedia.org/wiki/Cross-validation_(statistics)
13. https://www.geeksforgeeks.org/confusion-matrix-machine-learning/
14. https://machinelearningmastery.com/an-introduction-to-feature-selection/
15. https://classeval.wordpress.com/introduction/basic-evaluation-measures/


16. Md. Al Mehedi Hasan, Mohammed Nasser, Biprodip and Shamim Ahmad, "Support Vector Machine and Random Forest Modeling for IDS", JILSA, pp. 45-52, 2014.
17. Amreen Sultana and M. A. Jabbar, "Intelligent Network Intrusion Detection System using Data Mining Techniques", IEEE Xplore, 2017.
18. M. A. Jabbar, Rajanikanth Aluvalu, Sai Satyanarayana Reddy S, "RFAODE: A Novel Ensemble Intrusion Detection System", Elsevier, ICACC-2017, 22-24 August 2017.
19. M. A. Jabbar, B. A. Deekshatulu, P. Chandra, "Heart Disease Classification using Nearest Neighbor Classifier using Feature Subset Selection", Anale. Seria Informatică, Vol. XI fasc. 1, 2013.

AUTHORS PROFILE

M.Nikhitha is a research scholar at the Computer Science and Engineering Department, Vardhaman College of Engineering, Hyderabad, Telangana, India.

Dr. M.A.Jabbar is Vice Chair of the IEEE CS chapter, Hyderabad Section, and Professor and Centre Head at the Computer Science and Engineering Department, Vardhaman College of Engineering, Hyderabad, Telangana, India. He has been teaching for more than 19 years. He obtained his Doctor of Philosophy (Ph.D.) from JNTUH. He has published more than 50 papers in various journals and conferences. He is a reviewer for Scopus and SCI journals from publishers such as Springer, Elsevier and Wiley, and for IEEE Transactions on Systems, Man and Cybernetics. He has served as a technical committee member for more than 40 international conferences. He was Editor for the 1st ICMLSC 2018 international conference, held on 22nd and 23rd June 2018 at Hyderabad.


Table 3: Classification Accuracy Using Full Training and Cross validation.
