A video compression-cum-classification network for classification from compressed video streams

Cited by: 1

Authors:

Yadav, Sangeeta [1]

Gulia, Preeti [1]

Gill, Nasib Singh [1]

Yahya, Mohammad [2]

Shukla, Piyush Kumar [3]

Pareek, Piyush Kumar [4]

Shukla, Prashant Kumar [5]

Affiliations:

[1] Maharshi Dayanand Univ, Dept Comp Sci & Applicat, Rohtak, Haryana, India

[2] Oakland Univ, Rochester, MI USA

[3] Technol Univ Madhya Pradesh, Univ Inst Technol UIT, Rajiv Gandhi Proudyogiki Vishwavidyalaya RGPV, Dept Comp Sci & Engn, Bhopal, Madhya Pradesh, India

[4] Nitte Meenakshi Inst Technol, Dept Artificial Intelligence & Machine Learning, Bengaluru, Karnataka, India

[5] Koneru Lakshmaiah Educ Fdn, Dept Comp Sci & Engn, Guntur 522302, Andhra Pradesh, India

Source:

VISUAL COMPUTER | 2024 / Vol. 40 / Issue 11

Keywords:

Video analytics; Compression; Content tagging; Deep Learning; CNN; ConvGRU; SSIM; PSNR;

DOI:

10.1007/s00371-023-03242-w

Chinese Library Classification number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Video analytics can be made faster and more efficient by operating directly on the compressed video format, which relieves the analytics server of the decoding burden. Encoded video streams are rich in semantic binary information, and this information can be used to train classifiers more efficiently. Motivated by this idea, a deep learning-based video compression-cum-classification (DLVCC) network is proposed. In the proposed work, the binary-coded semantic information is extracted by an autoencoder-based video compression component and is then fed to a MobileNetV2-based classifier, which classifies the given video streams by their content. Using the large-scale user-generated content of the YouTube UGC dataset, it is demonstrated that using deep neural networks for compression not only yields compression results on par with traditional methods but also makes analytical processing of these videos faster. Video content tagging on the YouTube UGC dataset is used as the analytics task. The proposed DLVCC approach performs 10× faster with 30× fewer parameters than MobileNetV2 in video tagging of compressed video, with no loss in accuracy.


Pages: 7539-7558

Page count: 20

References: 38 in total
