Video object segmentation guided refinement on foreground-background objects

Devi, J. Sarala; Sulthana, A. Razia

doi:10.1007/s11042-022-12981-2

Published: 09 August 2022

Volume 82, pages 6769–6785 (2023)

Multimedia Tools and Applications

Abstract

Video Object Segmentation (VOS), which separates a foreground object from a video sequence, is an intricate task that relies on fine-tuning. Many recent approaches focus on pixel-wise matching of foreground objects and emphasize balancing the relations between pixels to identify the foreground, which can lead to misclassification. This paper explores the mapping between foreground and background objects in semi-supervised VOS by balancing and mutually mapping the pixels between them. The proposed model makes practical and effective use of enhanced pixel-level and instance-level matching to improve the prediction. Moreover, the framework implements ensemble learning with a Leaky-ReLU activation function, which improves the segmentation process. To evaluate the segmentation results, J and F scores are measured. We carry out extensive experiments on the popular DAVIS benchmark, in its 2016 and 2017 versions. Our model achieves a promising J & F score of 82%, surpassing the other techniques compared.
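
As an illustration of two terms used in the abstract, the sketch below shows a Leaky-ReLU activation and the J score, which on DAVIS is the region similarity (intersection over union) between a predicted mask and the ground-truth mask. It is a minimal sketch and not the authors' implementation: the function names, the negative slope of 0.01, and the toy masks are assumptions made for illustration. The F score, a contour-based F-measure, is omitted because it requires boundary extraction.

```python
def leaky_relu(x, negative_slope=0.01):
    """Leaky-ReLU: keep positive inputs, scale the rest by a small slope.

    The slope 0.01 is an assumed default, not a value taken from the paper.
    """
    return x if x > 0 else negative_slope * x


def jaccard(pred_mask, gt_mask):
    """Region similarity J: intersection over union of two binary masks.

    Masks are equally sized nested lists of 0/1 values (hypothetical format).
    """
    intersection = 0
    union = 0
    for pred_row, gt_row in zip(pred_mask, gt_mask):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                intersection += 1
            if p or g:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


# Toy 2x3 masks: 2 pixels overlap, 4 pixels are covered by either mask.
pred = [[0, 1, 1],
        [0, 1, 0]]
gt = [[0, 1, 0],
      [0, 1, 1]]

print(jaccard(pred, gt))   # 0.5
print(leaky_relu(-2.0))    # small negative value instead of 0
```

In a benchmark setting, J would be averaged over the frames and objects of each sequence; the toy call above scores a single frame only.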

Code Availability

The code for this work is available from the authors.

Funding

NA

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Birla Institute of Technology and Science, Pilani Dubai, UAE
J. Sarala Devi & A. Razia Sulthana

Authors

J. Sarala Devi
A. Razia Sulthana

Contributions

Both authors contributed equally to the work.

Corresponding author

Correspondence to A. Razia Sulthana.

Ethics declarations

Ethics Approval

The manuscript has not been submitted to any other journal, nor has any part of the work been published elsewhere.

Consent for Publication

The authors consent to publication if the article is accepted by the journal.

Conflicts of Interests

There is no conflict of interest with any firm or person.

Additional information

Availability of Data And Material

The data and material for this work are available from the authors.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Devi, J.S., Sulthana, A.R. Video object segmentation guided refinement on foreground-background objects. Multimed Tools Appl 82, 6769–6785 (2023). https://doi.org/10.1007/s11042-022-12981-2

Received: 08 February 2021
Revised: 19 May 2021
Accepted: 27 March 2022
Published: 09 August 2022
Issue Date: February 2023
DOI: https://doi.org/10.1007/s11042-022-12981-2
