Abstract
The neural network research field continues to produce novel models that outperform their predecessors. However, a large portion of the best-performing architectures are still fully hand-engineered by experts. Recently, methods that automate the search for optimal structures have started to match state-of-the-art hand-crafted designs. Nevertheless, replacing expert knowledge requires high efficiency from the search algorithm and flexibility from the model concept. This work proposes a set of structure-modifying operators designed specifically for the VALP, a recently introduced multi-network model for heterogeneous multi-task problems. These modifiers are embedded in a greedy multi-objective search algorithm with a non-dominance-based acceptance criterion, in order to test the viability of a structure-exploring method built on the operators. The results of the experiments carried out in this work indicate that the modifiers can indeed form part of intelligent searches over the space of VALP structures, which encourages further research in this direction.
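To make the abstract's description concrete, the sketch below illustrates one hedged reading of a greedy structural search with a non-dominance-based acceptance criterion. It is not the authors' implementation: the modifier operators, the evaluation function, and all names (`dominates`, `greedy_structural_search`, `evaluate`) are hypothetical placeholders standing in for the VALP-specific operators and per-task objectives described in the paper.

```python
# Minimal sketch (assumptions, not the paper's code) of a greedy multi-objective
# structural search that accepts a modified model unless the incumbent
# Pareto-dominates it.
import random
from typing import Callable, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if objective vector `a` Pareto-dominates `b` (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def greedy_structural_search(
    model,
    modifiers: List[Callable],                       # structure-modifying operators
    evaluate: Callable[[object], Sequence[float]],   # returns per-task objective values
    iterations: int = 100,
):
    """Greedily mutate the model structure, keeping any candidate that the
    current model does not strictly dominate."""
    scores = evaluate(model)
    for _ in range(iterations):
        modifier = random.choice(modifiers)
        candidate = modifier(model)        # apply one structural operator
        cand_scores = evaluate(candidate)
        # Non-dominance-based acceptance: reject only if the incumbent dominates.
        if not dominates(scores, cand_scores):
            model, scores = candidate, cand_scores
    return model, scores
```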
Acknowledgments
This work has been supported by the TIN2016-78365-R (Spanish Ministry of Economy, Industry and Competitiveness, http://www.mineco.gob.es/portal/site/mineco) and IT-1244-19 (Basque Government) programs. Unai Garciarena also holds a predoctoral grant (ref. PIF16/238) from the University of the Basque Country.
We also gratefully acknowledge the support of NVIDIA Corporation with the donation of a Titan X Pascal GPU used to accelerate the training of the models in this work.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Garciarena, U., Mendiburu, A., Santana, R. (2020). Automatic Structural Search for Multi-task Learning VALPs. In: Dorronsoro, B., Ruiz, P., de la Torre, J., Urda, D., Talbi, EG. (eds) Optimization and Learning. OLA 2020. Communications in Computer and Information Science, vol 1173. Springer, Cham. https://doi.org/10.1007/978-3-030-41913-4_3
DOI: https://doi.org/10.1007/978-3-030-41913-4_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-41912-7
Online ISBN: 978-3-030-41913-4