Detection of Spyware by Mining Executable Files
Objectives
The main objective of our project is to establish a method for spyware detection
using data mining techniques. These techniques are commonly used for information
retrieval and classification. We apply them with only one change: the input consists of
computer programs rather than text documents.
In this project, binary features are extracted from executable files. A feature
reduction method is then used to obtain a subset of the data, which serves as the
training set for automatically generating classifiers. The generated
classifiers are used to classify new, previously unseen binaries as either legitimate
software or spyware. We will choose an appropriate n-gram size n to yield high
performance, and a suitable machine learning algorithm to produce high accuracy.
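The role of n can be illustrated with a minimal sketch of n-gram extraction. It assumes that features are overlapping byte-level n-grams encoded as hexadecimal strings. The class name NGramSketch, the method extractNGrams and the sample bytes are hypothetical and are not taken from the project.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch: byte-level n-gram extraction (an assumed feature representation).
class NGramSketch {

    /** Returns the distinct overlapping byte n-grams of data, each as a lowercase hex string. */
    static Set<String> extractNGrams(byte[] data, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        Set<String> grams = new LinkedHashSet<>();
        // Slide a window of n bytes over the input, one byte at a time.
        for (int i = 0; i + n <= data.length; i++) {
            StringBuilder sb = new StringBuilder(2 * n);
            for (int j = 0; j < n; j++) {
                // Mask to 0..255 so negative bytes print as two hex digits.
                sb.append(String.format("%02x", data[i + j] & 0xff));
            }
            grams.add(sb.toString());
        }
        return grams;
    }

    public static void main(String[] args) {
        // Made-up sample bytes, used only for illustration.
        byte[] sample = {0x4d, 0x5a, (byte) 0x90, 0x00, 0x4d, 0x5a};
        System.out.println(extractNGrams(sample, 2));
    }
}
```

A larger n yields more specific features but also a much larger feature space, which is why the choice of n affects both accuracy and the cost of the later reduction step.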
Project idea
The goal of the project is to detect spyware by using data mining and machine
learning. We use the Waikato Environment for Knowledge Analysis (WEKA) to perform
the experiments. WEKA is a suite of machine learning algorithms and analysis tools,
which is used in practice for solving data mining problems. First, we extract features
from the binary files; we then apply a feature reduction method to reduce the complexity
of the data set. Finally, we convert the reduced feature set into the Attribute-Relation File
Format (ARFF). ARFF files are ASCII text files that include a set of data instances, each
described by a set of features. Figure 2.1 shows the steps involved in our proposed
method.
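The conversion to ARFF can be sketched as follows. This is a minimal illustration, assuming that each reduced feature becomes a binary {0,1} attribute indicating whether an n-gram is present, and that the class attribute takes the values legitimate and spyware. The class ArffSketch, the method toArff, the ng_ attribute prefix and the sample data are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: writing a reduced feature set as ARFF text.
class ArffSketch {

    /** Builds ARFF text with one {0,1} attribute per n-gram plus a nominal class attribute. */
    static String toArff(String relation, List<String> features,
                         List<boolean[]> rows, List<String> labels) {
        StringBuilder sb = new StringBuilder();
        // Header section: relation name, then one declaration per attribute.
        sb.append("@relation ").append(relation).append("\n\n");
        for (String f : features) {
            sb.append("@attribute ng_").append(f).append(" {0,1}\n");
        }
        sb.append("@attribute class {legitimate,spyware}\n\n@data\n");
        // Data section: one comma-separated line per instance, class label last.
        for (int r = 0; r < rows.size(); r++) {
            for (boolean present : rows.get(r)) {
                sb.append(present ? '1' : '0').append(',');
            }
            sb.append(labels.get(r)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up features and instances, used only for illustration.
        List<String> features = Arrays.asList("4d5a", "9000");
        List<boolean[]> rows = Arrays.asList(
                new boolean[]{true, false},
                new boolean[]{true, true});
        List<String> labels = Arrays.asList("legitimate", "spyware");
        System.out.print(toArff("spyware_ngrams", features, rows, labels));
    }
}
```

The resulting text can be saved with an .arff extension and loaded into WEKA as a data set.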
considered in one class. To obtain Reduced Feature Sets (RFSs) for CFBE and FBFE,
merge unique n-grams for both classes.
Hardware Requirements
HDD, 40 GB or more.
Software Requirements
Platform: Linux OS
Language: Java