Abstract
In data mining, instance reduction is a key data pre-processing step that simplifies and cleans raw data, either by selecting existing samples or by creating new ones, before a learning algorithm is applied. This usually leads to a complex, large-scale and computationally expensive optimisation problem, which has typically been tackled by sophisticated population-based metaheuristics. Unlike the recent literature, this article proposes a simple local search algorithm and its integration with an optional surrogate-assisted model. In accordance with variable decomposition techniques for large-scale problems, this local search perturbs an n-dimensional vector along the directions identified by its design variables, one at a time.
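The variable-by-variable perturbation described above can be sketched as follows. This is a minimal illustration, not the chapter's actual algorithm: the function name `coordinate_local_search`, the step-halving rule, the default parameters and the greedy acceptance criterion are all assumptions made for the example.

```python
from typing import Callable, List, Tuple


def coordinate_local_search(
    f: Callable[[List[float]], float],
    x0: List[float],
    step: float = 0.5,        # assumed initial step size
    min_step: float = 1e-6,   # assumed stopping threshold on the step
    max_evals: int = 10_000,  # assumed budget of objective calls
) -> Tuple[List[float], float, int]:
    """Minimise f by perturbing one design variable at a time."""
    x = list(x0)
    fx = f(x)
    evals = 1
    while step > min_step and evals < max_evals:
        improved = False
        for i in range(len(x)):           # visit the variables one by one
            for delta in (-step, step):   # try both directions along axis i
                if evals >= max_evals:
                    break
                trial = list(x)
                trial[i] += delta
                f_trial = f(trial)
                evals += 1
                if f_trial < fx:          # greedy acceptance of improvements
                    x, fx = trial, f_trial
                    improved = True
                    break
        if not improved:
            step *= 0.5                   # assumed rule: halve the step after a failed sweep
    return x, fx, evals
```

For instance reduction, `f` would be the expensive objective (for example, a classification error computed from the candidate reduced set), which is why the number of calls, returned here as `evals`, is the quantity worth saving.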
Empirical results on 40 small data sets show that, despite its simplicity, the proposed baseline local search on its own is competitive with more complex algorithms representing the state of the art for instance reduction in classification problems. The proposed local surrogate model reduces the number of computationally expensive objective function calls while achieving test accuracy that is overall comparable to that of its baseline counterpart.
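One way a local surrogate can save objective calls is to screen each trial move before paying for a true evaluation. The abstract does not specify the surrogate used in the chapter, so the sketch below assumes, purely for illustration, a one-dimensional least-squares line fitted to past evaluations along the current variable. The names `fit_line` and `surrogate_screened_move` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def fit_line(points: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Ordinary least-squares fit y ~ a + b * t for 1-D samples (t, y)."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in points)
    if var_t == 0.0:
        return mean_y, 0.0  # degenerate case: constant model
    b = sum((t - mean_t) * (y - mean_y) for t, y in points) / var_t
    return mean_y - b * mean_t, b


def surrogate_screened_move(
    f: Callable[[List[float]], float],
    x: List[float],
    fx: float,
    i: int,
    delta: float,
    history: List[Tuple[float, float]],
) -> Tuple[List[float], float, bool]:
    """Try x[i] += delta, calling f only if the surrogate predicts a gain.

    history holds (value of x[i], true f) pairs gathered along variable i
    while the other variables were held fixed (a simplifying assumption).
    Returns (point, objective value, whether a true evaluation was spent).
    """
    trial = list(x)
    trial[i] += delta
    if len(history) >= 2:
        a, b = fit_line(history)
        if a + b * trial[i] >= fx:
            return x, fx, False       # rejected by the surrogate, no expensive call
    f_trial = f(trial)                # expensive true evaluation
    history.append((trial[i], f_trial))
    if f_trial < fx:
        return trial, f_trial, True
    return x, fx, True
```

The trade-off in this kind of screening is that a poor surrogate may reject moves that would have improved the solution, so the savings in objective calls have to be weighed against any loss in final accuracy.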
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Neri, F., Triguero, I. (2020). A Local Search with a Surrogate Assisted Option for Instance Reduction. In: Castillo, P.A., Jiménez Laredo, J.L., Fernández de Vega, F. (eds) Applications of Evolutionary Computation. EvoApplications 2020. Lecture Notes in Computer Science, vol 12104. Springer, Cham. https://doi.org/10.1007/978-3-030-43722-0_37
DOI: https://doi.org/10.1007/978-3-030-43722-0_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-43721-3
Online ISBN: 978-3-030-43722-0
eBook Packages: Computer Science, Computer Science (R0)