A video compression-cum-classification network for classification from compressed video streams

Cited by: 1
Authors
Yadav, Sangeeta [1 ]
Gulia, Preeti [1 ]
Gill, Nasib Singh [1 ]
Yahya, Mohammad [2 ]
Shukla, Piyush Kumar [3 ]
Pareek, Piyush Kumar [4 ]
Shukla, Prashant Kumar [5 ]
Affiliations
[1] Maharshi Dayanand Univ, Dept Comp Sci & Applicat, Rohtak, Haryana, India
[2] Oakland Univ, Rochester, MI USA
[3] Technol Univ Madhya Pradesh, Univ Inst Technol UIT, Rajiv Gandhi Proudyogiki Vishwavidyalaya RGPV, Dept Comp Sci & Engn, Bhopal, Madhya Pradesh, India
[4] Nitte Meenakshi Inst Technol, Dept Artificial Intelligence & Machine Learning &, Bengaluru, Karnataka, India
[5] Koneru Lakshmaiah Educ Fdn, Dept Comp Sci & Engn, Guntur 522302, Andhra Pradesh, India
Keywords
Video analytics; Compression; Content tagging; Deep Learning; CNN; ConvGRU; SSIM; PSNR;
DOI
10.1007/s00371-023-03242-w
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
Video analytics can run faster and more efficiently by operating directly on the compressed video format, which relieves the analytics server of the decoding burden. Encoded video streams are rich in binary semantic information, and this information can be used more efficiently to train classifiers. Motivated by this idea, a deep learning-based video compression-cum-classification (DLVCC) network is proposed. In the proposed work, the binary-coded semantic information is extracted by an autoencoder-based video compression component and fed to a MobileNetv2-based classifier, which classifies the given video streams by their content. Using the large-scale user-generated content of the YouTube UGC dataset, it is demonstrated that deep neural network compression not only yields compression results on par with traditional methods but also makes analytical processing of these videos faster. Video content tagging on the YouTube UGC dataset is used as the analytics task. The proposed DLVCC approach performs 10x faster with 30x fewer parameters than MobileNetv2 in tagging compressed video, with no loss in accuracy.
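The keywords list PSNR among the quality metrics, and the abstract compares compression results against traditional methods. The following is a minimal sketch of the conventional PSNR definition, 10 * log10(peak^2 / MSE), for readers unfamiliar with the metric. It is not taken from the paper: the function names `mean_squared_error` and `psnr`, the flat list-of-pixels representation, and the default 8-bit peak value of 255 are all assumptions made for illustration.

```python
import math


def mean_squared_error(original, reconstructed):
    """Average squared difference between two equally sized pixel sequences."""
    if len(original) != len(reconstructed):
        raise ValueError("frames must contain the same number of pixels")
    if not original:
        raise ValueError("frames must not be empty")
    total = sum((a - b) ** 2 for a, b in zip(original, reconstructed))
    return total / len(original)


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB; `peak` is the assumed maximum pixel value."""
    err = mean_squared_error(original, reconstructed)
    if err == 0:
        # Identical frames: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)


if __name__ == "__main__":
    # Hypothetical 4-pixel frame and a reconstruction that is off by 1 everywhere.
    frame = [52, 55, 61, 59]
    decoded = [53, 56, 62, 60]
    print(f"PSNR: {psnr(frame, decoded):.2f} dB")
```

Higher PSNR indicates a reconstruction closer to the original frame. For real video, the same computation would typically be applied per frame and averaged over the sequence.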
Pages: 7539-7558
Page count: 20