Feature Selection Techniques
Abstract- One of the major challenges today is dealing with the large amounts of data extracted from networks that need to be analyzed. Feature selection plays a very important role in intrusion detection systems: it assists in selecting a minimal subset from the full set of features, which would otherwise demand more computation time, larger storage space, and so on. Feature selection has become of interest to many research areas dealing with machine learning and data mining, because it enables classifiers to be fast, cost-effective, and more accurate.
I. INTRODUCTION
Due to the availability of large amounts of data over the last few decades, analyzing data manually has become increasingly difficult, so the analysis must be computerized through data mining. Data mining helps in discovering hidden attributes on the basis of patterns, rules, and so on, and brings order to what would otherwise be an overwhelming mass of patterns. The data gathered from a network are raw and contain large log files that need to be compressed, so various feature selection techniques are used to eliminate irrelevant or redundant features from the dataset. Feature selection (FS) is the process of choosing a subset of relevant features for building a model, and it is one of the most frequently used and most important techniques in data preprocessing for data mining [1]. The goal of feature selection for a classification task is to maximize classification accuracy [2]. Because irrelevant features can introduce noise that affects classification accuracy negatively [3], removing them decreases the running time of the classifier and increases its accuracy. Feature selection also improves understandability, and the cost of data handling becomes smaller [4].
A dataset holds many features, but not all of them are necessarily relevant, so feature selection is used to eliminate the unrelated features without much loss of information. Feature selection is also known as attribute selection or variable selection [5]. Feature selection approaches are of three types (illustrative code sketches of the filter and wrapper approaches follow below):
• Filter approach
• Wrapper approach
• Embedded approach
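As a concrete illustration of the filter approach, the following is a minimal Python sketch in which each feature is scored independently of any classifier and the top-scoring features are kept. The synthetic data and the choice of absolute Pearson correlation as the scoring criterion are illustrative assumptions, not prescribed by this paper.

```python
# Filter approach sketch: score each feature independently of any learner,
# then keep the top-k. Data is synthetic; the criterion (absolute Pearson
# correlation with the label) is one common filter choice among many.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))          # 200 samples, 10 candidate features
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.1, size=200) > 0).astype(int)

def filter_scores(X, y):
    """Absolute Pearson correlation of each feature column with the label."""
    return np.array([abs(np.corrcoef(X[:, i], y)[0, 1]) for i in range(X.shape[1])])

scores = filter_scores(X, y)
top_k = np.argsort(scores)[::-1][:3]    # indices of the 3 highest-scoring features
print("filter-selected features:", top_k)
```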
The wrapper approach takes feature dependencies into account; its drawback is that it is slower than the filter method precisely because it models those dependencies. The quality of a feature subset is measured directly by the performance of the classifier, as sketched after Figure (1).
Figure (1): The wrapper approach. From the set of all features, a subset is generated and evaluated by a learning algorithm, whose performance guides the search for the next subset.
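As a hedged sketch of the wrapper loop in Figure (1): candidate subsets are generated greedily and each is scored by the cross-validated accuracy of an actual classifier. scikit-learn and logistic regression are illustrative choices here; the paper does not prescribe a particular learner or search strategy.

```python
# Wrapper approach sketch: greedy forward selection, where each candidate
# subset is evaluated by the cross-validated accuracy of a real classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))          # same synthetic setup as the filter sketch
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.1, size=200) > 0).astype(int)

def forward_selection(X, y, k):
    """Greedily add the feature that most improves CV accuracy."""
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < k:
        def cv_score(f):
            cols = selected + [f]
            clf = LogisticRegression(max_iter=1000)
            return cross_val_score(clf, X[:, cols], y, cv=5).mean()
        best = max(remaining, key=cv_score)   # feature giving the best subset
        selected.append(best)
        remaining.remove(best)
    return selected

print("wrapper-selected features:", forward_selection(X, y, 3))
```

Note how the classifier sits inside the selection loop: this is exactly why wrapper methods capture feature dependencies but run slower than filters.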
A widely used filter criterion is the correlation between an individual feature and the target. For a feature $x_i$ and target $Y$ it is defined as:

$R(i) = \dfrac{\mathrm{cov}(x_i, Y)}{\sqrt{\mathrm{var}(x_i)\,\mathrm{var}(Y)}}$  (3)
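For concreteness, Eq. (3) can be computed directly with numpy; the sample-variance convention (ddof=1, matching np.cov) is an implementation detail assumed here.

```python
# Correlation criterion R(i) from Eq. (3), computed with numpy.
import numpy as np

def correlation_criterion(x_i, Y):
    """Covariance of feature and target, normalized by their variances."""
    cov = np.cov(x_i, Y)[0, 1]                      # sample covariance (ddof=1)
    return cov / np.sqrt(np.var(x_i, ddof=1) * np.var(Y, ddof=1))

# a perfectly linear feature gives |R(i)| = 1
x = np.arange(10.0)
print(correlation_criterion(x, 2 * x + 1))          # 1.0
```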
The entropy of a set $S$ is defined as:

$\mathrm{Entropy}(S) = -\sum_{i=1}^{n} P_i \log_2 P_i$  (4)

where $n$ is the number of classes and $P_i$ is the probability that a sample of $S$ belongs to class $i$. The information gain of an attribute $A$ over $S$ is calculated as:
$\mathrm{Gain}(A) = \mathrm{Entropy}(S) - \sum_{k=1}^{m} \dfrac{|S_k|}{|S|}\,\mathrm{Entropy}(S_k)$  (5)
where $S_k$ is the $k$-th of the $m$ subsets obtained by partitioning $S$ on attribute $A$.
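A short Python sketch of Eqs. (4) and (5); the `subsets` argument, representing the partition of $S$ induced by attribute $A$, is an illustrative interface choice.

```python
# Entropy and information gain as in Eqs. (4)-(5).
import numpy as np

def entropy(labels):
    """Entropy(S) over the class distribution of `labels`."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(labels, subsets):
    """Gain(A) = Entropy(S) minus the size-weighted entropy of the subsets."""
    n = len(labels)
    weighted = sum(len(s) / n * entropy(s) for s in subsets)
    return entropy(labels) - weighted

# splitting [0,0,1,1] into two pure subsets removes all uncertainty
print(information_gain(np.array([0, 0, 1, 1]),
                       [np.array([0, 0]), np.array([1, 1])]))   # 1.0
```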
$H(Y) = -\sum_{i} P(y_i)\log P(y_i)$  (6)
The above equation represents the uncertainty (information content) in the output $Y$. Suppose we observe a variable $X$; the conditional entropy is then given by:

$H(Y \mid X) = -\sum_{j} P(x_j) \sum_{i} P(y_i \mid x_j)\log P(y_i \mid x_j)$  (7)
This equation implies that by observing the variable $X$, the uncertainty in the output $Y$ is reduced. The decrease in uncertainty is given as:

$I(Y; X) = H(Y) - H(Y \mid X)$  (8)
This quantity is the mutual information (MI) between $Y$ and $X$: it is zero if $X$ and $Y$ are independent and greater than zero if they are dependent, in which case one variable provides information about the other. The definitions above are stated for discrete variables; the continuous case is obtained by replacing the summations with integrations.
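As a worked illustration of Eqs. (6) through (8) for discrete variables, the sketch below estimates MI from an empirical joint distribution; it is a didactic implementation, not an optimized one.

```python
# Mutual information I(Y;X) = H(Y) - H(Y|X), estimated from the empirical
# joint distribution of two discrete variables.
import numpy as np

def mutual_information(x, y):
    xs, x_inv = np.unique(x, return_inverse=True)
    ys, y_inv = np.unique(y, return_inverse=True)
    joint = np.zeros((len(xs), len(ys)))
    for i, j in zip(x_inv, y_inv):          # count co-occurrences
        joint[i, j] += 1
    joint /= joint.sum()                    # joint probability P(x, y)
    px = joint.sum(axis=1, keepdims=True)   # marginal P(x)
    py = joint.sum(axis=0, keepdims=True)   # marginal P(y)
    nz = joint > 0                          # skip zero cells (0 * log 0 = 0)
    return np.sum(joint[nz] * np.log2(joint[nz] / (px @ py)[nz]))

# independent variables give MI = 0; identical variables give MI = H(X)
a = np.array([0, 0, 1, 1]); b = np.array([0, 1, 0, 1])
print(mutual_information(a, b))   # 0.0  (independent)
print(mutual_information(a, a))   # 1.0  (fully dependent)
```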
The second stage is redundancy analysis, which selects predominant features from the relevant set obtained in the first stage. This selection is an iterative process that removes the variables forming an approximate Markov blanket. Symmetrical Uncertainty (SU) is a normalized information-theoretic measure that uses entropy and conditional entropy values to calculate dependencies between features. An SU value of 0 indicates that two features are totally independent, while a value of 1 indicates that the value of one feature can be totally predicted from the other.
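Based on the description above, Symmetrical Uncertainty is commonly computed as mutual information normalized by the sum of the marginal entropies, $SU(X, Y) = 2\,I(X; Y) / (H(X) + H(Y))$; this normalization is what bounds SU to [0, 1]. The sketch below assumes that specific formulation (the paper does not spell it out) and reuses the entropy() and mutual_information() helpers defined in the earlier sketches.

```python
# Symmetrical Uncertainty: MI normalized by the sum of marginal entropies,
# so the result lies in [0, 1]. Assumes the common SU definition and reuses
# entropy() and mutual_information() from the sketches above.
def symmetrical_uncertainty(x, y):
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0          # both variables are constant; no dependency to measure
    return 2.0 * mutual_information(x, y) / (hx + hy)

print(symmetrical_uncertainty(a, b))  # 0.0 -> totally independent
print(symmetrical_uncertainty(a, a))  # 1.0 -> one feature fully predicts the other
```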
IV. CONCLUSION
Feature selection is an important issue in classification because it can have a considerable effect on the accuracy of the classifier. It reduces the dimensionality of the dataset, so processor and memory usage drop and the data become more comprehensible and easier to study. In this paper, various feature selection techniques have been discussed. Among the three approaches, filter methods should be used to obtain results quickly and on large datasets; if the results must be accurate and optimal, a wrapper method such as one based on a genetic algorithm (GA) should be used.
REFERENCES
[1] Asha Gowda Karegowda, M. A. Jayaram and A. S. Manjunath, "Feature Subset Selection Problem using Wrapper Approach in Supervised Learning," International Journal of Computer Applications, Vol. 1, No. 7, pp. 0975–8887, 2010.
[2] Ron Kohavi and George H. John, "Wrappers for Feature Subset Selection," Artificial Intelligence, Vol. 97, No. 1–2, pp. 273–324, 1997.
[3] S. Doraisami and S. Golzari, "A Study on Feature Selection and Classification Techniques for Automatic Genre Classification of Traditional Malay Music," Content-Based Retrieval, Categorization and Similarity, 2008.
[4] A. Arauzo-Azofra, J. L. Aznarte and J. M. Benítez, "Empirical Study of Feature Selection Methods Based on Individual Feature Evaluation for Classification Problems," Expert Systems with Applications, Vol. 38, pp. 8170–8177, 2011.
[5] S. Beniwal and J. Arora, "Classification and Feature Selection Techniques in Data Mining," International Journal of Engineering Research & Technology (IJERT), Vol. 1, No. 6, 2012.
[6] A. K. Uysal and S. Gunal, "A Novel Probabilistic Feature Selection Method for Text Classification," Knowledge-Based Systems, Vol. 36, pp. 226–235, 2012.
[7] S. Guan, J. Liu and Y. Qi, "An Incremental Approach to Contribution-Based Feature Selection," Journal of Intelligence Systems, Vol. 13, No. 1, 2004.
[8] M. M. Kabir, M. M. Islam and K. Murase, "A New Wrapper Feature Selection Approach Using Neural Network," in Proceedings of the Joint Fourth International Conference on Soft Computing and Intelligent Systems and Ninth International Symposium on Advanced Intelligent Systems (SCIS&ISIS 2008), Japan, pp. 1953–1958, 2008.
[9] M. M. Kabir, M. M. Islam and K. Murase, "A New Wrapper Feature Selection Approach Using Neural Network," Neurocomputing, Vol. 73, pp. 3273–3283, May 2010.
[10] E. Gasca, J. Sanchez and R. Alonso, "Eliminating Redundancy and Irrelevance Using a New MLP-Based Feature Selection Method," Pattern Recognition, Vol. 39, pp. 313–315, 2006.
[11] C. Hsu, H. Huang and D. Schuschel, "The ANNIGMA-Wrapper Approach to Fast Feature Selection for Neural Nets," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, Vol. 32, No. 2, pp. 207–212, April 2002.
[12] A. Ghareb, A. Bakar and A. Hamdan, "Hybrid Feature Selection Based on Enhanced Genetic Algorithm for Text Categorization," Expert Systems with Applications, Elsevier, 2015.
[13] R. K. Sivagaminathan and S. Ramakrishnan, "A Hybrid Approach for Feature Subset Selection Using Neural Networks and Ant Colony Optimization," Expert Systems with Applications, Vol. 33, pp. 49–60, 2007.
[14] X. Wang, J. Yang, X. Teng, W. Xia and R. Jensen, "Feature Selection Based on Rough Sets and Particle Swarm Optimization," Pattern Recognition Letters, Vol. 28, No. 4, pp. 459–471, November 2006.
[15] M. M. Kabir, M. M. Islam and K. Murase, "A New Local Search Based Hybrid Genetic Algorithm for Feature Selection," Neurocomputing, Vol. 74, pp. 2194–2928, May 2011.
[16] M. Zhu and J. Song, "An Embedded Backward Feature Selection Method for MCLP Classification Algorithm," Information Technology and Quantitative Management, Elsevier, 2013.