A Comprehensive Performance Evaluation of Image Quality Assessment Algorithms

Cited by: 81
Authors
Athar, Shahrukh [1]
Wang, Zhou [1]
Affiliations
[1] Univ Waterloo, Dept Elect & Comp Engn, Waterloo, ON N2L 3G1, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Performance evaluation; Databases; Testing; Distortion; Image quality; Prediction algorithms; Visualization; Image quality assessment; performance evaluation; image quality study; full-reference IQA; no-reference IQA; FR fusion; rank aggregation; image databases; NATURAL SCENE STATISTICS; STRUCTURAL SIMILARITY; PREDICTION; INDEX; CLASSIFICATION; INFORMATION; FRAMEWORK; DATABASE; METRICS; SCORES;
DOI
10.1109/ACCESS.2019.2943319
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Image quality assessment (IQA) algorithms aim to predict perceived image quality by human observers. Over the last two decades, a large amount of work has been carried out in the field. New algorithms are being developed at a rapid rate in different areas of IQA, but are often tested and compared with limited existing models using out-of-date test data. There is a significant gap when it comes to large-scale performance evaluation studies that include a wide variety of test data and competing algorithms. In this work we aim to fill this gap by carrying out the largest performance evaluation study so far. We test the performance of 43 full-reference (FR), seven fused FR (22 versions), and 14 no-reference (NR) methods on nine subject-rated IQA datasets, of which five contain singly distorted images and four contain multiply distorted content. We use a variety of performance evaluation and statistical significance testing criteria. Our findings not only point to the top performing FR and NR IQA methods, but also highlight the performance gap between them. In addition, we have also conducted a comparative study on FR fusion methods, and an important discovery is that rank aggregation based FR fusion is able to outperform not only other FR fusion approaches but also the top performing FR methods. It may be used to annotate IQA datasets as a possible alternative to subjective ratings, especially in situations where it is not possible to obtain human opinions, such as in the case of large-scale datasets composed of thousands or even millions of images.
Pages: 140030-140070
Number of pages: 41