Effective Data Mining Techniques For Intrusion Detection and Prevention System
International Conference on Advanced Computing, Communication and Networks'11

packet activity that might indicate malicious behavior. Some intrusion detection systems (IDS) and intrusion prevention systems (IPS) detect attacks based on specific signatures of known threats, similar to the way antivirus software typically detects and protects against malware, while other IDS detect attacks by comparing traffic patterns against a baseline and looking for anomalies [2].

This paper introduces the Network Intrusion Detection System (NIDS), which uses a suite of data mining techniques to automatically detect attacks against computer networks and systems. While the long-term objective of NIDS is to address all aspects of intrusion detection, in this paper we present two specific contributions: (i) an unsupervised anomaly detection technique that assigns each network connection a score reflecting how anomalous the connection is, and (ii) an association pattern analysis module that summarizes the network connections ranked as highly anomalous by the anomaly detection module [3].

Fig 1. Intrusion Detection and Prevention Scorecard [4].

The following methods are used in intrusions:
- Distributed Denial of Service.
- Viruses and Worms.
- IP Spoofing.
- Network/Port Scans.

Every IDS needs an information source to monitor for intrusive behavior. The information source can include network traffic (packets), host resources (CPU, I/O operations, and log files), user activity, file activity, etc. [4].

There are multiple "penetration points" at which intrusions can take place in a network system. For example, at the network level, carefully crafted "malicious" IP packets can crash a victim host; at the host level, vulnerabilities in system software can be exploited to yield an illegal root shell. Since activities at different penetration points are normally recorded in different audit data sources, an IDS often needs to be extended to incorporate additional modules that specialize in certain components (e.g., hosts, subnets, etc.) of the network system. The large traffic volume in security-related mailing lists and Web sites suggests that new system security holes and intrusion methods are continuously being discovered [6].

The first threat to a computer network system was realised in 1988, when 23-year-old Robert Morris launched the first worm, which overran over 6000 computers on the ARPANET network. On February 7th, 2000, the first high-volume DoS attacks were launched, targeting the computer systems of large companies such as Yahoo!, eBay, Amazon, CNN, ZDNet and Datek. More details on these attacks can be found in [8].

If the network is small and signatures are kept up to date, intrusion detection by a human analyst works well. In an organization with a large, complex network, however, analysts become overwhelmed by the number of alarms, because every generated alarm needs to be reviewed.

The sensors on the MITRE network, for example, currently generate over one million alarms per day, and that number is increasing. This situation arises from ever-increasing attacks on the network, as well as a tendency for sensor patterns to be insufficiently selective (i.e., to raise too many false alarms). Commercial tools typically do not provide an enterprise-level view of alarms generated by multiple sensor vendors [9].

Intrusion Detection before Data Mining.
When we started intrusion detection on our organization's network, we did not focus on data mining, but rather on more fundamental issues: How are alarms generated? How much data would we get? How would we display the data? And what type of data do we want to monitor?

We began to suspect that our system was inadequate for detecting the most dangerous attacks: those performed by adversaries using attacks that are new, stealthy, or both. So we considered data mining with two questions in mind [9]:
- Can we develop a way to minimize what the analysts need to look at daily?
- Can data mining help us find attacks that the sensors and analysts did not find?

Data Mining: What is it?
Data mining is the process of extracting patterns from large datasets by combining methods from statistics and artificial intelligence with database management.

In intrusion detection systems (IDS) and intrusion prevention systems (IPS), data mining is used for the following tasks:
- Remove normal activity from alarm data.
- Identify false alarm generators and attack sensor signatures.
- Identify long, ongoing patterns in IP packets.
- Find bad activity.

2. LITERATURE SURVEY

Intrusion detection (IDS) and intrusion prevention (IPS) are a war that must be fought day and night, without rest, across thousands of companies and their networks. Winning strategies will be as varied as the organizations designing them, but none will succeed without a comprehensive solution for securing the data. Data security depends on a complete but flexible toolset capable of managing, maintaining and securing the data. The goal of intrusion detection is to monitor network assets to detect anomalous behavior and misuse. This concept has been around for nearly twenty years, but only recently has it seen a dramatic rise in popularity and incorporation into the overall information security infrastructure.

The current architecture for intrusion detection is shown in Figure 2. Network traffic is analyzed by a variety of available sensors. This sensor data is pulled periodically to a central server for conditioning and input to a relational database. HOMER filters events from the sensor data before they are passed on to the classifier and clustering analyses. Data mining tools filter false alarms and identify anomalous behavior in the large amounts of remaining data. A web server is available as a front end to the database if needed, and analysts can launch a number of predefined queries as well as free-form SQL queries from this interface [9].

Fig 3. How sensors feed into the overall intrusion detection system.

Intrusion detection systems (IDS) and intrusion prevention systems (IPS) use some common terminology:
- Alert/Alarm: A signal indicating that a system has been or is being attacked.
- True Positive: A legitimate attack that triggers an IDS to produce an alarm, showing that an attack has taken place.
- False Positive: An event that signals an IDS to produce an alarm when no attack has taken place.
- Alarm filtering: The process of categorizing attack alerts produced by an IDS in order to distinguish false positives from actual attacks.

Feature selection from the available data is vital to the effectiveness of the methods employed. Researchers apply various analysis procedures to the accumulated data in order to select the set of features that they think maximizes the effectiveness of their data mining techniques. Table 1 contains some examples of the features selected. Each of these features offers a valuable piece of information to the system. Extracted features can be ranked with respect to their contribution and utilized accordingly [10].

Table 1. Features that have been applied in data mining for IDS and IPS [10].

Data mining based IDS generally do not give good results when they are developed in a simulated environment and then deployed in a real environment. They generate a lot of false alarms (false positives).

Current weaknesses in intrusion detection and intrusion prevention systems include new attacks that are not detected until someone has generated a rule or signature for that specific attack. Also, most attacks need only an alteration of existing malicious code in order to bypass existing signatures. Hence, new signatures are generally created manually.
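The weakness of signature-based detection described above can be illustrated with a minimal sketch. The signature names and byte patterns below are hypothetical placeholders, not entries from any real IDS rule set, and plain substring search stands in for a real matching engine. The sketch shows a known payload raising an alarm, while a slightly altered copy of the same payload passes undetected until someone writes a new signature.

```python
from typing import Dict, List

# Hypothetical signature database: name -> byte pattern of a known attack.
# These entries are illustrative assumptions, not real signatures.
SIGNATURES: Dict[str, bytes] = {
    "example-worm": b"EXPLOIT-PAYLOAD-V1",
    "example-scan": b"PROBE-ALL-PORTS",
}


def match_signatures(payload: bytes) -> List[str]:
    """Return the names of all signatures whose pattern occurs in the payload."""
    return [name for name, pattern in SIGNATURES.items() if pattern in payload]


# Three illustrative packets: a known attack, an altered variant, normal traffic.
known_attack = b"GET /index.html EXPLOIT-PAYLOAD-V1"
altered_attack = b"GET /index.html EXPLOIT-PAYLOAD-V2"
normal_traffic = b"GET /index.html HTTP/1.1"

for label, packet in [
    ("known attack", known_attack),
    ("altered attack", altered_attack),
    ("normal traffic", normal_traffic),
]:
    alarms = match_signatures(packet)
    print(label, "->", alarms if alarms else "no alarm")
```

In this sketch the altered attack is a missed detection: nothing is reported until a matching pattern is added to SIGNATURES by hand, which is the manual step that the data mining techniques in the next section aim to reduce.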
3. METHODOLOGY

In this paper we propose the following data mining methods for intrusion detection and intrusion prevention systems. Data mining may be thought of as the most interesting approach to accomplishing intrusion detection and intrusion prevention. In IDS and IPS, data mining is used to discover consistent and useful patterns of system features that describe user behavior. Intrusion detection and intrusion prevention systems are of two types:
- Misuse-based systems.
- Anomaly-based systems.

Thus we can introduce INIDS. Not only will INIDS be an integrated system that uses both misuse-based and anomaly-based approaches, but it also applies classification rules to the data.

Misuse-based | Anomaly-based
The attacks uncovered under this are assumed to be true positives. | The normal packets separated under this are assumed to be true negatives.
It risks high porosity towards new and undefined attacks. | It risks normal but undefined packets being tagged as abnormal data.
It has a chance of failing to capture many attacks. | It has a tendency to show a greater number of false positives.

Table 2. Comparison of misuse-based and anomaly-based systems.

Data mining based intrusion detection systems have demonstrated high accuracy, good generalization to novel types of intrusion, and robust behavior in a changing environment. Figure 4 depicts the data mining phases (Pei et al.: Data Mining Techniques for Intrusion Detection and Computer Security) [11].

Fig 4. Data Mining Phases.

The intrusion detection (IDS) and intrusion prevention system is an integrated system which uses both misuse-based and anomaly-based approaches. The data mining techniques for intrusion detection and intrusion prevention systems are as follows.

I] Classification rules
Classification rules are used to discover attacks in TCPdump data. These classification rules are used to accurately capture the behavior of intrusions and normal activities for the data mining system. The classification rule that we use is the decision tree.

Decision Tree: Decision tree induction is the learning of decision trees from class-labeled training tuples. A decision tree is a flowchart-like tree structure, where each internal node denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node holds a class label. The topmost node in the tree is the root node. To decide which attributes determine how the tree is formed, we need an attribute selection measure. The measure that we use is called information gain.

Classification and prediction are two forms of data analysis that can be used to extract models describing important data classes or to predict future data trends. For example, a classification model can be built to categorize bank loan applications as either safe or risky. In other words, classification maps a data item into one of several predefined categories. These algorithms normally output "classifiers". A prediction model can be built to predict the expenditures of potential customers on computer equipment given their income and occupation.

An ideal application in intrusion detection would be to gather sufficient "normal" and "abnormal" audit data for a user or a program, then apply a classification algorithm to learn a classifier that can label or predict new unseen audit data as belonging to the normal class or the abnormal class.
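As a minimal sketch of the attribute selection step, information gain can be computed as the entropy of the class labels minus the weighted entropy that remains after splitting on an attribute A: Gain(A) = H(S) - sum over the values v of A of (|S_v| / |S|) * H(S_v). The records, the attribute names (service, failed_logins) and the labels below are invented for illustration and are not taken from the TCPdump data discussed in this paper. Under these assumptions, the attribute with the highest gain would be chosen as the root test of the decision tree.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(records, attribute, label_key="label"):
    """Entropy of the labels minus the weighted entropy after splitting on attribute."""
    base = entropy([r[label_key] for r in records])
    remainder = 0.0
    for value, count in Counter(r[attribute] for r in records).items():
        subset = [r[label_key] for r in records if r[attribute] == value]
        remainder += (count / len(records)) * entropy(subset)
    return base - remainder


# Hypothetical labeled audit records (assumed data, for illustration only).
RECORDS = [
    {"service": "http", "failed_logins": "high", "label": "abnormal"},
    {"service": "ftp", "failed_logins": "high", "label": "abnormal"},
    {"service": "http", "failed_logins": "low", "label": "normal"},
    {"service": "ftp", "failed_logins": "low", "label": "normal"},
    {"service": "http", "failed_logins": "high", "label": "abnormal"},
    {"service": "ftp", "failed_logins": "low", "label": "normal"},
]

ATTRIBUTES = ["service", "failed_logins"]

# The attribute with the largest gain becomes the test at the root node.
gains = {a: information_gain(RECORDS, a) for a in ATTRIBUTES}
best_attribute = max(gains, key=gains.get)

for name, gain in gains.items():
    print(f"{name}: gain = {gain:.4f}")
print("root attribute:", best_attribute)
```

In this toy data set, failed_logins separates the two classes perfectly, so it has the larger gain and would be tested first. A full decision tree learner would repeat the same computation on each resulting subset until the leaves hold a single class label.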
II] Knowledge discovery in databases (KDD)
KDD can be defined as "the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data." Data mining is a particular step in this process in which specific algorithms are applied to extract patterns from data.

The KDD process involves a number of steps and is often interactive, iterative, and user-driven [12]:
- Getting to know the application domain: trying to understand the data and the discovery task.
- Data mining: includes first deciding what model, for example summarization, classification, or clustering, is to be derived from the data.
- Using the discovered knowledge: includes incorporating the knowledge into a production system, or simply reporting it to interested parties.

CONCLUSIONS

In this paper we first identified the types of attack that take place on a network or database. Classification rules can be used in intrusion detection systems (IDS) and intrusion prevention systems (IPS) to classify attacks and signatures. Many possibilities have been considered, even the incorporation of artificial intelligence. We have shown the ways in which data mining has been known to aid the process of intrusion detection and the ways in which the various techniques have been applied in IDS and IPS.

REFERENCES

[1] J. P. Anderson. Computer Security Threat Monitoring and Surveillance. Technical report, James P. Anderson Co., Fort Washington, Pennsylvania.
[2] T. Bradley (n.d.). "Introduction to Intrusion Detection Systems (IDS)". [Online] Available on http://netsecurity.about.com/cs/hackertools/a/aa030504.html.
[3] "Network Intrusion Detection System (NIDS) Using Data Mining Techniques". [Online] Available on http://etrx.spit.ac.in/ieee_colloquium/Information_Security/spit-265.pdf.
[4] Intrusion Detection and Prevention Scorecard. [Online] Available on http://www.strategy2act.com/solutions/scorecard-reports/bsc_intrusion_prevention.html.
[5] What is an IDS? [Online] Available on http://www.idstutorial.com/what-is-ids.php.
[6] "A Data Mining Framework for Building Intrusion Detection Models". [Online] Available on http://citeseerx.ist.psu.edu/viewdoc/download.
[7] "Network intrusion detection (IDS) and intrusion protection system (IPS)". [Online] Available on http://www.cdacbangalore.in/design/corporate_site/override/pdf-doc/projects/GYN.pdf.
[8] Available on http://www.securityfocus.com/news/2445.
[9] Eric Bloedorn, Alan D. Christiansen, William Hill, Clement Skorupka, Lisa M. Talbot, Jonathan Tivel. "Data Mining for Network Intrusion Detection: How to Get Started". The MITRE Corporation.
[10] Theodoros Lappas and Konstantinos Pelechrinis. "Data Mining Techniques for (Network) Intrusion Detection Systems". [Online] Available on http://www.mendeley.com/research/data-warehousing-and-data-mining-techniques-for-intrusion-detection-systems/
[11] Mohammad Hossein Haratian. "An architectural design for a hybrid intrusion detection system for data base". [Online] Available on www.citeseerx.ist.psu.edu/viewdoc/download.
[12] Wenke Lee. "A Data Mining Framework for Constructing Features and Models for Intrusion Detection Systems". [Online] Available on http://portal.acm.org/citation.cfm?id=929987&dl=GUIDE&coll=GUIDE