Abstract
Convolutional Neural Networks have been widely employed in a diverse range of computer vision applications, including image classification, object recognition, and object segmentation. Nevertheless, one weakness of such models lies in their hyperparameter settings, which are highly specific to each particular problem. A common approach is to employ meta-heuristic optimization algorithms to find suitable sets of hyperparameters, at the expense of an increased computational burden that renders them infeasible in real-time scenarios. In this paper, we address this problem by creating Convolutional Neural Network ensembles through Single-Iteration Optimization, a fast procedure composed of only one iteration, which amounts to nothing more than a random search. Essentially, the idea is to provide the same capability offered by long-running optimizations, but without their computational load. Results on four well-known datasets revealed that one-iteration optimized ensembles provide promising accuracy while diminishing the time needed to achieve it.
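To make the idea concrete, the sketch below (in Python, matching the repository linked in the notes that follow) illustrates one reading of single-iteration optimization: running a meta-heuristic for a single iteration reduces to drawing its initial population uniformly at random from the hyperparameter intervals, exactly as a random search would. The interval values and the names sample_hyperparameters and single_iteration_optimization are illustrative assumptions for exposition, not the repository's actual API.

    import random

    # Illustrative intervals only; the real ranges live in the repository's
    # model_specs.py and differ per architecture.
    RANGES = {
        "learning_rate": (1e-4, 1e-1),
        "momentum": (0.5, 0.99),
        "weight_decay": (1e-5, 1e-2),
    }

    def sample_hyperparameters(ranges, rng):
        """Draw one candidate uniformly at random from each interval."""
        return {name: rng.uniform(low, high)
                for name, (low, high) in ranges.items()}

    def single_iteration_optimization(ranges, k, seed=0):
        """One meta-heuristic iteration == sampling the initial population.

        Each of the K returned candidates is used to train a separate CNN;
        the K trained networks then form the ensemble.
        """
        rng = random.Random(seed)
        return [sample_hyperparameters(ranges, rng) for _ in range(k)]

    candidates = single_iteration_optimization(RANGES, k=5)

Because no further iterations refine these candidates, the cost of the "optimization" collapses to the cost of training the K sampled networks.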
Notes
Traditional optimization methods rely on gradients and Hessians, which are computationally costly to obtain, and are susceptible to local optima.
Notice that this procedure is also required to perform meta-heuristic optimizations, where larger intervals may require more time for the algorithm to find suitable solutions.
Our source code is available at https://github.com/lzfelix/random_ensembles.
One can find the proposed architectures at https://github.com/lzfelix/random_ensembles/tree/master/experiments/models.
One can find the distribution’s ranges at https://github.com/lzfelix/random_ensembles/blob/master/experiments/models/model_specs.py.
Note that K stands for the number of networks considered in the ensemble; a sketch of how the K members' predictions can be combined follows these notes.
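Complementing the note on K above, the snippet below shows one simple way the K trained networks' outputs can be combined. Averaging softmax probabilities is a common combination rule; we present it only as a minimal sketch under that assumption, not as the paper's exact ensembling step, and ensemble_predict is an illustrative name of ours.

    import numpy as np

    def ensemble_predict(probabilities):
        """Combine K networks by averaging their class probabilities.

        `probabilities` has shape (K, n_samples, n_classes), one slice per
        trained network; the predicted label is the arg-max of the mean.
        Majority voting over per-network arg-maxes is a simple alternative.
        """
        return probabilities.mean(axis=0).argmax(axis=1)

    # Toy check: K = 3 networks, 2 samples, 4 classes.
    probs = np.random.dirichlet(np.ones(4), size=(3, 2))
    print(ensemble_predict(probs))  # one predicted class index per sample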
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Acknowledgements
The authors are grateful to CNPq grants #430274/2018-1, #304315/2017-6, #307066/2017-7, and #427968/2018-6, as well as São Paulo Research Foundation (FAPESP) grants #2013/07375-0, #2014/12236-1, #2017/25908-6, #2018/21934-5, #2019/07665-4, and #2019/02205-5.
About this article
Cite this article
Ribeiro, L.C.F., Rosa, G.H.d., Rodrigues, D. et al. Convolutional neural networks ensembles through single-iteration optimization. Soft Comput 26, 3871–3882 (2022). https://doi.org/10.1007/s00500-022-06791-9