Abstract
Cluster analysis is an important task in data mining: it groups a set of objects so that similarity among objects within the same group is maximal while similarity among objects from different groups is minimal. Particle swarm optimization (PSO) is one of the best-known metaheuristic optimization algorithms and has been successfully applied to the clustering problem. However, it has two major shortcomings. First, PSO converges rapidly during the initial stages of the search, but its convergence becomes very slow near the global optimum. Second, it may get trapped in a local optimum if the global best and local best values are equal to the particle’s position over a certain number of iterations. In this paper we hybridize PSO with a heuristic search algorithm to overcome these shortcomings. In the proposed algorithm, called PSOHS, particle swarm optimization is used to produce an initial solution to the clustering problem, and a heuristic search algorithm is then applied to improve the quality of this solution by searching around it. The superiority of the proposed PSOHS clustering method over other popular clustering methods is established on seven benchmark and real datasets: Iris, Wine, Crude Oil, Cancer, CMC, Glass and Vowel.
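The two-stage idea described above (a global PSO stage followed by a local search around its result) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' PSOHS implementation. The abstract does not say which heuristic search is used, so the second stage here is a simple coordinate-wise pattern search swapped in as a stand-in. The objective (sum of squared distances to the nearest centroid), the function names `sse`, `unflatten`, `pso_stage` and `local_search_stage`, and all parameter values are likewise hypothetical choices.

```python
import random


def sse(data, centroids):
    """Assumed objective: sum of squared distances to the nearest centroid."""
    return sum(
        min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
        for p in data
    )


def unflatten(vec, k, dim):
    """Split a flat particle vector into k centroids of length dim."""
    return [vec[i * dim:(i + 1) * dim] for i in range(k)]


def pso_stage(data, k, n_particles=10, iters=50, w=0.72, c1=1.49, c2=1.49, rng=None):
    """Stage 1: standard global-best PSO over flattened centroid vectors."""
    rng = rng or random.Random(0)
    dim = len(data[0])
    # Each particle starts from k randomly chosen data points.
    pos = [[x for p in rng.sample(data, k) for x in p] for _ in range(n_particles)]
    vel = [[0.0] * (k * dim) for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_f = [sse(data, unflatten(p, k, dim)) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    for _ in range(iters):
        for i in range(n_particles):
            for j in range(k * dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][j] = (w * vel[i][j]
                             + c1 * r1 * (pbest[i][j] - pos[i][j])
                             + c2 * r2 * (gbest[j] - pos[i][j]))
                pos[i][j] += vel[i][j]
            f = sse(data, unflatten(pos[i], k, dim))
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = pos[i][:], f
    return gbest, gbest_f


def local_search_stage(data, k, start, start_f, step=0.5, min_step=1e-3, max_rounds=200):
    """Stage 2 (stand-in heuristic): coordinate-wise search around `start`.

    Tries +/- step on every coordinate, keeps strict improvements, and
    halves the step whenever a full sweep finds nothing better.
    """
    dim = len(data[0])
    best, best_f = start[:], start_f
    rounds = 0
    while step > min_step and rounds < max_rounds:
        rounds += 1
        improved = False
        for j in range(len(best)):
            for delta in (step, -step):
                cand = best[:]
                cand[j] += delta
                f = sse(data, unflatten(cand, k, dim))
                if f < best_f:
                    best, best_f, improved = cand, f, True
        if not improved:
            step /= 2
    return best, best_f
```

A caller would pass the output of `pso_stage` directly to `local_search_stage`. Because the second stage only accepts strict improvements, its objective value can never be worse than the PSO result it starts from, which is the property the two-stage design relies on.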


Cite this article
Hatamlou, A., Hatamlou, M. PSOHS: an efficient two-stage approach for data clustering. Memetic Comp. 5, 155–161 (2013). https://doi.org/10.1007/s12293-013-0110-x
DOI: https://doi.org/10.1007/s12293-013-0110-x