Abstract
The classic computational scheme of convolutional layers relies on filter banks that are shared over all the spatial coordinates of the input, independently of external information about what is specifically under observation and without any distinction between what is close to the observed area and what is peripheral. In this paper we propose to go beyond such a scheme by introducing the notion of Foveated Convolutional Layer (FCL), which formalizes the idea of location-dependent convolutions with foveated processing, i.e., fine-grained processing in a given focused area and coarser processing in the peripheral regions. We show how foveated computation can be exploited not only as a filtering mechanism, but also as a means to speed up inference with respect to classic convolutional layers, allowing the user to select the appropriate trade-off between level of detail and computational burden. FCLs can be stacked into neural architectures, and we evaluate them on several tasks, showing how they efficiently handle the information in the peripheral regions, possibly avoiding the development of misleading biases. When integrated with a model of human attention, FCL-based networks naturally implement a foveated visual system that guides attention toward the locations of interest, as we analyze experimentally on a stream of visual stimuli.
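To make the trade-off between level of detail and computational burden concrete, the following is a back-of-the-envelope sketch and not the authors' implementation. It assumes, purely for illustration, that a circular focus region is convolved at full resolution while the periphery is processed on a grid subsampled by a factor `reduction` along each axis. The function names, the parameters, and the cost model (multiply-accumulates of a k x k convolution) are all hypothetical.

```python
import math


def conv_macs(num_positions, kernel_size, c_in, c_out):
    """Multiply-accumulates of a k x k convolution at num_positions output locations."""
    return num_positions * kernel_size * kernel_size * c_in * c_out


def foveated_macs(height, width, focus, radius, reduction,
                  kernel_size, c_in, c_out):
    """Rough cost of a two-region foveated convolution (illustrative cost model).

    Pixels within `radius` of `focus` are processed at full resolution.
    The remaining pixels are assumed to be processed on a grid subsampled
    by `reduction` per axis, so their count shrinks by reduction**2.
    """
    fy, fx = focus
    inner = sum(
        1
        for y in range(height)
        for x in range(width)
        if (y - fy) ** 2 + (x - fx) ** 2 <= radius ** 2
    )
    outer = height * width - inner
    coarse_positions = math.ceil(outer / reduction ** 2)
    return conv_macs(inner + coarse_positions, kernel_size, c_in, c_out)


# Toy comparison on a 4 x 4 input with a 3 x 3 kernel and one channel in and out.
dense = conv_macs(4 * 4, kernel_size=3, c_in=1, c_out=1)
foveated = foveated_macs(4, 4, focus=(0, 0), radius=1, reduction=2,
                         kernel_size=3, c_in=1, c_out=1)
print(dense, foveated)
```

In this toy setting the dense layer costs 144 multiply-accumulates and the foveated variant 63. Under this cost model, enlarging `radius` or lowering `reduction` moves the cost back toward the dense figure, which is the kind of knob the abstract refers to.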
Notes
- 1.
In our experiments we used an exponential law, with \(\hat{\sigma }_{a}\) close to zero and \(\hat{\sigma }_{A}=10\). The function g is computed on a discrete grid of fixed size \(7 \times 7\).
- 2.
The innermost region \(\mathcal {R}_1\) is then a circle, and the other regions are concentric annuli (circular crowns) with increasing radii. The outermost region \(\mathcal {R}_R\) is simply the complementary area. We also tested the case of a square \(\mathcal {R}_1\) and frame-like \(\mathcal {R}_i\), \(i>1\).
- 3.
- 4.
We measured the number of floating-point operations using the PyTorch profiling utilities (https://pytorch.org/docs/stable/profiler.html).
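The region layout described in note 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `region_index_map`, its parameters, and the use of pixel-unit radii are hypothetical. The Euclidean metric gives a circular \(\mathcal {R}_1\) surrounded by annuli, while the Chebyshev metric gives the square and frame-like variant; in both cases the last region is the complement.

```python
import math


def region_index_map(height, width, focus, radii, metric="euclidean"):
    """Assign each pixel a 1-based region index around a focus point.

    radii: increasing outer radii of regions 1 .. R-1 (in pixels).
    Region R = len(radii) + 1 collects every pixel beyond the last radius.
    metric: "euclidean" for a circle plus annuli,
            "chebyshev" for a square plus frame-like regions.
    """
    if metric not in ("euclidean", "chebyshev"):
        raise ValueError("metric must be 'euclidean' or 'chebyshev'")
    fy, fx = focus
    region_map = []
    for y in range(height):
        row = []
        for x in range(width):
            dy, dx = abs(y - fy), abs(x - fx)
            dist = math.hypot(dy, dx) if metric == "euclidean" else max(dy, dx)
            index = len(radii) + 1  # default: outermost, complementary region
            for i, radius in enumerate(radii):
                if dist <= radius:
                    index = i + 1
                    break
            row.append(index)
        region_map.append(row)
    return region_map


# Three regions on a 5 x 5 grid with the focus at the center.
circular = region_index_map(5, 5, focus=(2, 2), radii=[1, 2])
framed = region_index_map(5, 5, focus=(2, 2), radii=[1, 2], metric="chebyshev")
for row in circular:
    print(row)
```

A map of this kind could then be used to select which filter, or which level of detail, applies at each spatial location.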
Acknowledgements
This work was partly supported by the PRIN 2017 project RexLearn, funded by the Italian Ministry of Education, University and Research (grant no. 2017TWNMH2), and also by the French government, through the 3IA Côte d’Azur, Investment in the Future, project managed by the National Research Agency (ANR) with the reference number ANR-19-P3IA-0002.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Tiezzi, M. et al. (2023). Foveated Neural Computation. In: Amini, MR., Canu, S., Fischer, A., Guns, T., Kralj Novak, P., Tsoumakas, G. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2022. Lecture Notes in Computer Science, vol 13715. Springer, Cham. https://doi.org/10.1007/978-3-031-26409-2_2
DOI: https://doi.org/10.1007/978-3-031-26409-2_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26408-5
Online ISBN: 978-3-031-26409-2
eBook Packages: Computer Science (R0)