Abstract
Trackers based on Siamese networks are an important research direction in visual object tracking. However, Siamese trackers generally suffer from insufficient feature representation ability. To improve this situation, we introduce an effective multi-frequency attention module, which considers the attention mechanism from the frequency-domain perspective and effectively improves the feature representation of the tracker. In addition, densely connected convolutional networks are applied to the tracker, fusing shallow appearance information with deep semantic features of the object, which enhances the localization and representation ability of the tracker. The proposed method achieves excellent results in Siamese network tracking. To demonstrate the effectiveness of our strategy, we conducted experiments on the OTB100, OTB2013, VOT2016, and VOT2018 benchmarks; the results indicate that the proposed strategy achieves high accuracy and efficiency and remains robust under occlusion and illumination changes. Experiments also show that the proposed method achieves real-time tracking.
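The abstract's core idea, reweighting feature channels with attention computed in the frequency domain, can be illustrated with a short PyTorch sketch. This is a minimal sketch assuming an FcaNet-style design in which each channel group is projected onto a fixed 2D DCT basis instead of plain global average pooling; the module name, frequency pairs, layer sizes, and feature-map dimensions are illustrative assumptions, not the authors' exact configuration.

# Minimal sketch of a multi-frequency (DCT-based) channel attention block.
# Frequency choices, reduction ratio, and tensor sizes are illustrative assumptions.
import math
import torch
import torch.nn as nn


def dct_filter(u, v, h, w):
    """2D DCT basis function of frequency (u, v) on an h x w grid."""
    basis = torch.zeros(h, w)
    for i in range(h):
        for j in range(w):
            basis[i, j] = math.cos(math.pi * u * (i + 0.5) / h) * \
                          math.cos(math.pi * v * (j + 0.5) / w)
    return basis


class MultiFrequencyAttention(nn.Module):
    def __init__(self, channels, h, w,
                 freqs=((0, 0), (0, 1), (1, 0), (1, 1)), reduction=16):
        super().__init__()
        # Split channels into groups, one DCT frequency per group.
        assert channels % len(freqs) == 0
        self.group = channels // len(freqs)
        bases = torch.stack([dct_filter(u, v, h, w) for u, v in freqs])  # (F, h, w)
        self.register_buffer("bases", bases)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, h, w = x.shape
        # Project each channel group onto its DCT basis instead of plain GAP.
        xg = x.view(b, len(self.bases), self.group, h, w)
        weights = torch.einsum("bfghw,fhw->bfg", xg, self.bases).reshape(b, c)
        scale = self.fc(weights).view(b, c, 1, 1)
        return x * scale  # reweight feature channels


if __name__ == "__main__":
    feat = torch.randn(2, 256, 22, 22)       # e.g. a Siamese search-branch feature map
    att = MultiFrequencyAttention(256, 22, 22)
    print(att(feat).shape)                    # torch.Size([2, 256, 22, 22])

In a Siamese tracker, such a block would typically be inserted after the shared backbone so that both the template and search branches use the same channel reweighting; the choice of DCT frequencies trades richer frequency coverage against extra computation.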





Data availability
The data used to support the findings of this study have been deposited in The Visual Object Tracking VOT2016 Challenge Results (https://doi.org/10.1007/978-3-319-48881-3_54) and The Sixth Visual Object Tracking (https://doi.org/10.1007/978-3-030-11009-3_1) repositories.
Funding
Funding was provided by the National Key Research and Development Program of China (Grant Nos. 2020YFB1712401, 2018******4402).
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Pang, H., Han, L., Liu, C. et al. Siamese object tracking based on multi-frequency enhancement feature. Vis Comput 40, 261–271 (2024). https://doi.org/10.1007/s00371-023-02779-0