FULL-REFERENCE AND NO-REFERENCE QUALITY ASSESSMENT FOR COMPRESSED USER-GENERATED CONTENT VIDEOS

Cited by: 2
Authors
Li, Yang [1 ]
Feng, Longtao [1 ]
Xu, Jingwen [1 ]
Zhang, Tao [1 ]
Liao, Yiting [1 ]
Li, Junlin [1 ]
Affiliations
[1] Bytedance Inc, Media Fdn Team, Beijing, Peoples R China
Source
2021 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW) | 2021
Keywords
User-generated content; video quality assessment; convolutional neural network; Transformer;
DOI
10.1109/ICMEW53276.2021.9456013
CLC Classification Number
TP39 [Applications of Computers];
Discipline Code
081203; 0835;
Abstract
With the development of video capture devices and network technology, recent years have witnessed an exponential increase of user-generated content (UGC) videos on various sharing platforms. Compared with professionally generated content (PGC) videos, UGC videos are generally captured by amateurs with smartphone cameras in everyday scenes and contain various in-capture distortions. In addition, these videos undergo multiple processing stages that may affect their perceptual quality before they are finally viewed by end users. Such complex and diverse distortion types make objective quality assessment difficult. In this paper, we present a data-driven video quality assessment (VQA) method for UGC videos based on a convolutional neural network (CNN) and a Transformer. Specifically, the CNN backbone extracts features from frames, and its output is fed to a Transformer encoder to predict the quality score. The proposed method can be used for both full-reference (FR) and no-reference (NR) VQA with slight adaptations. Our method ranks first in the MOS track and second in the DMOS track of the challenge on quality assessment of compressed UGC videos [1].
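The abstract's pipeline — per-frame CNN features passed through a Transformer encoder and regressed to a scalar quality score — can be sketched in miniature. This is an illustrative assumption, not the authors' implementation: the random "frame features" stand in for the CNN backbone output, the single-head self-attention layer stands in for the Transformer encoder, and all weights and dimensions are made up for the toy example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
T, d = 8, 16  # toy sizes: 8 frames, 16-dim per-frame feature

# Stand-in for the CNN backbone: one feature vector per frame
frame_feats = rng.standard_normal((T, d))

# Hypothetical single-head self-attention (the core of a Transformer
# encoder layer); real models add multi-head attention, feed-forward
# sublayers, residual connections, and layer normalization
Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
Q, K, V = frame_feats @ Wq, frame_feats @ Wk, frame_feats @ Wv
attn = softmax(Q @ K.T / np.sqrt(d), axis=-1)  # (T, T) frame-to-frame weights
encoded = attn @ V                             # temporally mixed frame features

# Regression head: average-pool over frames, project to a scalar score
w_out = rng.standard_normal(d) / np.sqrt(d)
score = float(encoded.mean(axis=0) @ w_out)
print(f"predicted quality score (toy): {score:.4f}")
```

The same skeleton serves both the FR and NR settings described in the abstract: for FR, the per-frame features would additionally encode reference/distorted differences before entering the encoder.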
Pages: 6
References
34 records
[1] Bampis, Christos G.; Li, Zhi; Bovik, Alan C. Spatiotemporal Feature Integration and Model Fusion for Full Reference Video Quality Assessment. IEEE Transactions on Circuits and Systems for Video Technology, 2019, 29(8): 2256-2270.
[2] Bosse, Sebastian; Maniry, Dominique; Mueller, Klaus-Robert; Wiegand, Thomas; Samek, Wojciech. Deep Neural Networks for No-Reference and Full-Reference Image Quality Assessment. IEEE Transactions on Image Processing, 2018, 27(1): 206-219.
[3] Deng J. Proc. CVPR IEEE, 2009: 248. DOI: 10.1109/CVPRW.2009.5206848.
[4] Dosovitskiy A. arXiv:2010.11929, 2021.
[5] He, Kaiming; Zhang, Xiangyu; Ren, Shaoqing; Sun, Jian. Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016: 770-778.
[6] He, Lihuo; Lu, Wen; Jia, Changcheng; Hao, Lei. Video Quality Assessment by Compact Representation of Energy in 3D-DCT Domain. Neurocomputing, 2017, 269: 108-116.
[7] Kim, Woojae; Kim, Jongyoo; Ahn, Sewoong; Kim, Jinwoo; Lee, Sanghoon. Deep Video Quality Assessor: From Spatio-Temporal Visual Sensitivity to a Convolutional Neural Aggregation Network. Computer Vision - ECCV 2018, Pt I, 2018, 11205: 224-241.
[8] Kingma DP. Advances in Neural Information Processing Systems, 2014, 27.
[10] Larson, Eric C.; Chandler, Damon M. Most Apparent Distortion: Full-Reference Image Quality Assessment and the Role of Strategy. Journal of Electronic Imaging, 2010, 19(1).