A video compression-cum-classification network for classification from compressed video streams

  • Original article
  • Published in: The Visual Computer

Abstract

Video analytics can gain speed and efficiency by operating directly on the compressed video format, thereby relieving the analytics server of the decoding burden. Encoded video streams are rich in semantic binary information, and this information can be exploited to train classifiers more efficiently. Motivated by this observation, a deep learning-based video compression-cum-classification (DLVCC) network is proposed. In the proposed approach, the binary-coded semantic information is extracted by an autoencoder-based video compression component and fed to a MobileNetV2-based classifier, which classifies the given video streams according to their content. Using large-scale user-generated content from the YouTube UGC dataset, it is demonstrated that deep neural network-based compression not only matches the compression performance of traditional methods but also makes analytical processing of these videos faster. Video content tagging on the YouTube UGC dataset is used as the analytics task. On compressed video, the proposed DLVCC approach tags videos 10× faster with 30× fewer parameters than MobileNetV2, with no loss in accuracy.
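To make the two-stage pipeline described in the abstract concrete, the following is a minimal, illustrative sketch: an autoencoder-style compression encoder produces a compact latent representation of a frame, and a MobileNetV2-based classifier tags content directly from that latent rather than from decoded pixels. The layer sizes, channel counts, the 1×1 adapter convolution, and the use of torchvision's mobilenet_v2 are assumptions made for illustration; this is not the authors' DLVCC implementation.

```python
# Illustrative sketch only (assumed architecture, not the paper's DLVCC network):
# a compression encoder whose compact latent is fed to a MobileNetV2-based
# classifier for compressed-domain content tagging.
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v2  # assumes torchvision >= 0.13


class CompressionEncoder(nn.Module):
    """Convolutional autoencoder front end mapping a video frame to a compact
    latent (the 'semantic' code that would normally be entropy-coded)."""

    def __init__(self, latent_channels: int = 16):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, latent_channels, kernel_size=5, stride=2, padding=2),
        )

    def forward(self, frame: torch.Tensor) -> torch.Tensor:
        # Output spatial size is 1/8 of the input: (B, latent_channels, H/8, W/8)
        return self.encode(frame)


class CompressedDomainClassifier(nn.Module):
    """Predicts content tags directly from the compressed-domain latent using a
    MobileNetV2 backbone adapted to the latent's channel count."""

    def __init__(self, latent_channels: int = 16, num_tags: int = 15):
        super().__init__()
        # 1x1 convolution so the latent matches the backbone's 3-channel input.
        self.adapt = nn.Conv2d(latent_channels, 3, kernel_size=1)
        self.backbone = mobilenet_v2(weights=None, num_classes=num_tags)

    def forward(self, latent: torch.Tensor) -> torch.Tensor:
        return self.backbone(self.adapt(latent))


if __name__ == "__main__":
    frames = torch.randn(2, 3, 224, 224)           # toy batch of frames
    latent = CompressionEncoder()(frames)           # compact compressed representation
    logits = CompressedDomainClassifier()(latent)   # content-tag scores
    print(logits.shape)                             # torch.Size([2, 15])
```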



Data availability

Data are available on request from the first author.


Author information

Contributions

S.Y. and P.G. were involved in conceptualization; formal analysis was done by S.Y. and P.K.S.; S.Y., P.G. and N.S.G. contributed to the methodology; S.Y. contributed resources; supervision was done by P.G., N.S.G. and P.K.P.; validation was done by P.G., N.S.G. and M.Y.; visualization was done by S.Y., P.G. and P.K.S.; S.Y. wrote the original draft; P.K.S. contributed to writing, review and editing. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Piyush Kumar Shukla.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Yadav, S., Gulia, P., Gill, N.S. et al. A video compression-cum-classification network for classification from compressed video streams. Vis Comput 40, 7539–7558 (2024). https://doi.org/10.1007/s00371-023-03242-w


