Abstract
Learning from limited data is challenging because data scarcity leads to poor generalization of the trained model. A classical globally pooled representation is likely to discard useful local information. Many recent few-shot learning methods address this challenge by using deep descriptors and learning a pixel-level metric. However, using deep descriptors as feature representations may discard the contextual information of the image. Moreover, most of these methods deal with each class in the support set independently, which cannot fully exploit discriminative information and task-specific embeddings. In this paper, we propose a novel transformer-based neural network architecture called sparse spatial transformers (SSFormers), which finds task-relevant features and suppresses task-irrelevant ones. Specifically, we first divide each input image into several image patches of different sizes to obtain dense local features. These features retain contextual information while expressing local information. Then, a sparse spatial transformer layer is proposed to find spatial correspondence between the query image and the full support set, selecting task-relevant image patches and suppressing task-irrelevant ones. Finally, we propose an image patch-matching module to calculate the distance between dense local representations, thereby determining which category in the support set the query image belongs to. Extensive experiments on popular few-shot learning benchmarks demonstrate the superiority of our method over state-of-the-art methods.
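To make the three stages concrete, the following is a minimal PyTorch sketch of the pipeline described above: dense local features from multi-size patches, sparse attention of query patches over the whole support set, and patch-level matching. It is an illustration under our own simplifying assumptions, not the authors' released implementation: the function names (extract_patch_features, sparse_spatial_attention, patch_match_score), the keep_ratio top-k selection rule, and grid pooling of the backbone feature map as a stand-in for multi-size patch division are all hypothetical choices.

import torch
import torch.nn.functional as F

def extract_patch_features(images, encoder, patch_sizes=(2, 4)):
    # Encode images, then pool the feature map at several grid sizes as a
    # stand-in for dividing the input into patches of different sizes.
    feats = encoder(images)                       # (B, C, H, W)
    patches = []
    for p in patch_sizes:
        pooled = F.adaptive_avg_pool2d(feats, p)  # (B, C, p, p)
        patches.append(pooled.flatten(2))         # (B, C, p*p)
    return torch.cat(patches, dim=-1)             # (B, C, M) dense local features

def sparse_spatial_attention(query_patches, support_patches, keep_ratio=0.5):
    # Cross-attend each query patch over ALL support patches of the task,
    # then keep only the most task-relevant query patches (hypothetical
    # top-k rule standing in for the paper's sparse selection).
    # query_patches: (C, Mq); support_patches: (C, Ms), pooled over the task.
    scale = query_patches.shape[0] ** 0.5
    attn = torch.softmax(query_patches.t() @ support_patches / scale, dim=-1)
    aligned = attn @ support_patches.t()          # (Mq, C) task-aligned features
    relevance = attn.max(dim=-1).values           # per-patch task relevance
    k = max(1, int(keep_ratio * relevance.numel()))
    idx = relevance.topk(k).indices               # indices of task-relevant patches
    return query_patches.t()[idx], aligned[idx]   # selected query / aligned patches

def patch_match_score(query_sel, class_patches):
    # Patch matching: each selected query patch takes its best cosine match
    # among one class's support patches; scores are averaged per class.
    q = F.normalize(query_sel, dim=-1)            # (K, C)
    s = F.normalize(class_patches.t(), dim=-1)    # (Ms, C)
    sim = q @ s.t()                               # (K, Ms)
    return sim.max(dim=-1).values.mean()          # higher = closer class

In an N-way task, one would compute patch_match_score between the selected query patches and each class's support patches, then assign the query to the class with the highest score.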
Acknowledgements
This work was partially supported by the National Natural Science Foundation of China (Grant Nos. 62176116, 62073160, 62276136) and the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (Grant No. 20KJA520006).
Cite this article
Chen, H., Li, H., Li, Y. et al. Sparse spatial transformers for few-shot learning. Sci. China Inf. Sci. 66, 210102 (2023). https://doi.org/10.1007/s11432-022-3700-8