Abstract
This chapter provides a formal definition of the problem of cluster analysis and of the related problem of community detection in graphs. Building on the mathematical definition of these problems, we motivate the use of evolutionary computation in this setting. We then review previous work on this topic, highlighting key approaches to the choice of representation and objective functions, and to the final process of model selection. Finally, we discuss successful applications of evolutionary clustering and the steps we consider necessary to encourage the uptake of these techniques in mainstream machine learning.
Notes
1. Cluster validity indices can be external or internal, depending on whether or not they depend on knowledge of the correct partition (ground truth) to determine solution quality.
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Handl, J., Garza-Fabre, M., José-García, A. (2024). Evolutionary Clustering and Community Detection. In: Banzhaf, W., Machado, P., Zhang, M. (eds) Handbook of Evolutionary Machine Learning. Genetic and Evolutionary Computation. Springer, Singapore. https://doi.org/10.1007/978-981-99-3814-8_6
DOI: https://doi.org/10.1007/978-981-99-3814-8_6
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-3813-1
Online ISBN: 978-981-99-3814-8
eBook Packages: Computer Science, Computer Science (R0)