Blind Prediction of Natural Video Quality

Cited by: 366
Authors
Saad, Michele A. [1]
Bovik, Alan C. [1]
Charrier, Christophe [2]
Affiliations
[1] Univ Texas Austin, Dept Elect & Comp Engn, Austin, TX 78712 USA
[2] Univ Caen, Dept Elect & Comp Engn, F-50000 St Lo, France
Funding
U.S. National Science Foundation;
Keywords
Video quality assessment; discrete cosine transform; egomotion; generalized Gaussian; MOTION; STATISTICS; ALGORITHM; RESPONSES;
DOI
10.1109/TIP.2014.2299154
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose a blind (no-reference, or NR) video quality evaluation model that is not distortion-specific. The approach relies on a spatio-temporal model of video scenes in the discrete cosine transform domain, and on a model that characterizes the type of motion occurring in the scenes, to predict video quality. We use the models to define video statistics and perceptual features that are the basis of a video quality assessment (VQA) algorithm that does not require a pristine video to compare against in order to predict a perceptual quality score. The contributions of this paper are threefold. 1) We propose a spatio-temporal natural scene statistics (NSS) model for videos. 2) We propose a motion model that quantifies motion coherency in video scenes. 3) We show that the proposed NSS and motion coherency models are appropriate for quality assessment of videos, and we utilize them to design a blind VQA algorithm that correlates highly with human judgments of quality. The proposed algorithm, called Video BLIINDS, is tested on the LIVE VQA database and on the EPFL-PoliMi video database and shown to perform close to the level of top-performing reduced-reference and full-reference VQA algorithms.
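The abstract and keywords point to statistics of discrete cosine transform coefficients described with a generalized Gaussian model. As a rough illustration of that idea only, the sketch below takes the difference of two frames, applies a block DCT, pools the AC coefficients, and estimates a generalized Gaussian shape parameter by moment matching. It is not the authors' Video BLIINDS implementation: the 4x4 block size, the moment-matching estimator, the bisection bounds, and all function names are assumptions made for this example.

```python
import math
import random


def dct2(block):
    """Orthonormal 2-D DCT-II of a square block (list of equal-length rows)."""
    n = len(block)
    basis = [
        [
            (math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n))
            * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        ]
        for k in range(n)
    ]
    # Transform along the vertical axis, then along the horizontal axis.
    tmp = [[sum(basis[k][i] * block[i][j] for i in range(n)) for j in range(n)]
           for k in range(n)]
    return [[sum(tmp[k][j] * basis[l][j] for j in range(n)) for l in range(n)]
            for k in range(n)]


def ac_coefficients(prev_frame, next_frame, block=4):
    """Pool the AC DCT coefficients of the frame difference, block by block."""
    h, w = len(prev_frame), len(prev_frame[0])
    diff = [[next_frame[r][c] - prev_frame[r][c] for c in range(w)]
            for r in range(h)]
    coeffs = []
    for r0 in range(0, h - block + 1, block):
        for c0 in range(0, w - block + 1, block):
            patch = [row[c0:c0 + block] for row in diff[r0:r0 + block]]
            d = dct2(patch)
            # Skip the DC term at (0, 0); keep the remaining AC terms.
            coeffs.extend(d[k][l] for k in range(block) for l in range(block)
                          if (k, l) != (0, 0))
    return coeffs


def ggd_moment_ratio(shape):
    """E[|x|]^2 / E[x^2] for a zero-mean generalized Gaussian density."""
    return math.gamma(2.0 / shape) ** 2 / (
        math.gamma(1.0 / shape) * math.gamma(3.0 / shape))


def estimate_ggd_shape(samples, lo=0.1, hi=10.0, iters=60):
    """Moment-matching shape estimate, found by bisection on [lo, hi]."""
    mean_abs = sum(abs(x) for x in samples) / len(samples)
    mean_sq = sum(x * x for x in samples) / len(samples)
    if mean_sq == 0.0:
        raise ValueError("samples are all zero; shape is undefined")
    target = mean_abs ** 2 / mean_sq
    # The ratio increases with the shape parameter, so bisection applies.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if ggd_moment_ratio(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    rng = random.Random(0)
    size = 64  # hypothetical frame size, synthetic noise instead of real video
    prev_frame = [[rng.gauss(0.0, 1.0) for _ in range(size)] for _ in range(size)]
    next_frame = [[rng.gauss(0.0, 1.0) for _ in range(size)] for _ in range(size)]
    coeffs = ac_coefficients(prev_frame, next_frame)
    print("estimated shape:", estimate_ggd_shape(coeffs))
```

A shape estimate near 2 corresponds to Gaussian-like coefficients and one near 1 to Laplacian-like coefficients. How such parameters would be pooled into quality features is described in the paper itself and is not reproduced here.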
Pages: 1352-1365
Number of pages: 14