Abstract
Feature selection (FS) methods identify and remove irrelevant and redundant attributes from the original feature vector, that is, attributes that contribute little to the performance of a predictive model. Meta-heuristic FS algorithms, which are commonly used to solve this problem, need a good trade-off between exploitation and exploration of the search space. Genetic Algorithm (GA), a popular meta-heuristic algorithm, lacks exploitation capability, which weakens its local search ability. GA relies on the mutation operation for exploitation, and this operation has certain limitations. As a result, GA can get stuck in local optima. To counter this problem, in the present work we combine the Great Deluge Algorithm (GDA), a local search algorithm, with GA: GDA is used in place of the mutation operation of GA. Applying GDA yields a high degree of exploitation by perturbing candidate solutions. The proposed method is named Deluge based Genetic Algorithm (DGA). We have applied DGA to 15 publicly available standard datasets taken from the UCI dataset repository. To show the classifier-independent nature of the proposed FS method, we have used three different classifiers, namely K-Nearest Neighbour (KNN), Multi-layer Perceptron (MLP) and Support Vector Machine (SVM). DGA has been compared with other contemporary algorithms: the basic version of GA, Particle Swarm Optimisation (PSO), Simulated Annealing (SA) and Histogram based Multi-Objective GA (HMOGA). The comparison shows that DGA outperforms the other methods in most cases. Thus, the main contributions of this paper are (1) the introduction of a new variant of GA for FS that uses GDA to strengthen its exploitation ability and (2) the application of the proposed method to 15 well-known UCI datasets using KNN, MLP and SVM classifiers.
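The idea of using GDA in place of mutation can be illustrated with a minimal Python sketch. It is not the implementation evaluated in this paper. The function names, the parameters `slack` and `rain_speed`, the single-point crossover, the elitist replacement and the toy fitness function are all assumptions made for illustration. In a wrapper setting the fitness would instead be the accuracy of a classifier such as KNN, MLP or SVM on the selected features.

```python
import random

# Hypothetical toy problem: the "relevant" features are the 1s in TARGET.
TARGET = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0]

def toy_fitness(mask):
    """Stand-in for classifier accuracy: fraction of positions matching TARGET."""
    return sum(m == t for m, t in zip(mask, TARGET)) / len(TARGET)

def great_deluge(mask, fitness, rng, iterations=30, slack=0.05, rain_speed=0.01):
    """Great Deluge style local search (maximisation) over a 0/1 feature mask."""
    current, current_fit = list(mask), fitness(mask)
    best, best_fit = list(current), current_fit
    level = current_fit - slack              # assumed initial "water level"
    for _ in range(iterations):
        candidate = list(current)
        i = rng.randrange(len(candidate))
        candidate[i] = 1 - candidate[i]      # perturb: flip one feature in or out
        cand_fit = fitness(candidate)
        if cand_fit > level:                 # accept any candidate above the water
            current, current_fit = candidate, cand_fit
            level += rain_speed              # the water rises after each accepted move
            if current_fit > best_fit:
                best, best_fit = list(current), current_fit
    return best                              # best mask seen, never worse than the input

def dga(fitness, n_features, rng, pop_size=10, generations=20):
    """GA loop in which the local search above takes the place of mutation."""
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Small offset keeps roulette-wheel weights strictly positive.
        weights = [fitness(ind) + 1e-9 for ind in population]
        children = []
        while len(children) < pop_size:
            a, b = rng.choices(population, weights=weights, k=2)
            cut = rng.randrange(1, n_features)   # single-point crossover
            child = a[:cut] + b[cut:]
            children.append(great_deluge(child, fitness, rng))  # instead of mutation
        # Assumed elitist replacement: keep the best of parents and children.
        population = sorted(population + children, key=fitness, reverse=True)[:pop_size]
    return population[0]

rng = random.Random(0)
best = dga(toy_fitness, len(TARGET), rng)
print(best, toy_fitness(best))
```

In this sketch the local search returns the best mask it has visited, so replacing mutation with it cannot lower the fitness of a child. This is one simple way to obtain the stronger exploitation that the abstract attributes to GDA.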
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests.
About this article
Cite this article
Guha, R., Ghosh, M., Kapri, S. et al. Deluge based Genetic Algorithm for feature selection. Evol. Intel. 14, 357–367 (2021). https://doi.org/10.1007/s12065-019-00218-5
DOI: https://doi.org/10.1007/s12065-019-00218-5