KonVid-150k: A Dataset for No-Reference Video Quality Assessment of Videos in-the-Wild

Cited by: 33
Authors:
Goetz-Hahn, Franz [1 ]
Hosu, Vlad [1 ]
Lin, Hanhe [1 ]
Saupe, Dietmar [1 ]
Affiliation:
[1] Univ Konstanz, Dept Comp Sci, D-78464 Constance, Germany
Keywords:
Streaming media; distortion; feature extraction; quality assessment; video recording; training; cameras; datasets; deep transfer learning; multi-level spatially-pooled features; video quality assessment; video quality dataset; prediction
DOI:
10.1109/ACCESS.2021.3077642
Chinese Library Classification (CLC): TP [Automation Technology; Computer Technology]
Discipline classification code: 0812
Abstract:
Video quality assessment (VQA) methods focus on particular degradation types, usually artificially induced on a small set of reference videos. Hence, most traditional VQA methods under-perform in-the-wild. Deep learning approaches have had limited success due to the small size and diversity of existing VQA datasets, whether artificially or authentically distorted. We introduce a new in-the-wild VQA dataset that is substantially larger and more diverse: KonVid-150k. It consists of a coarsely annotated set of 153,841 videos with five quality ratings each, and 1,596 videos with a minimum of 89 ratings each. Additionally, we propose new efficient VQA approaches (MLSP-VQA) relying on multi-level spatially pooled deep features (MLSP). Compared to deep transfer learning approaches, they are exceptionally well suited for training at scale. Our best method, MLSP-VQA-FF, improves the Spearman rank-order correlation coefficient (SRCC) performance metric on the commonly used KoNViD-1k in-the-wild benchmark dataset to 0.82. It surpasses the best existing deep-learning model (0.80 SRCC) and the best hand-crafted feature-based method (0.78 SRCC). We further investigate how alternative approaches perform under different levels of label noise and different dataset sizes, showing that MLSP-VQA-FF is the overall best method for videos in-the-wild. Finally, we show that the MLSP-VQA models trained on KonVid-150k set a new state of the art for cross-test performance on KoNViD-1k and LIVE-Qualcomm, with SRCCs of 0.83 and 0.64, respectively. For KoNViD-1k, this inter-dataset testing outperforms intra-dataset experiments, showing excellent generalization.
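For readers who want to experiment with the two technical ingredients the abstract names, the sketch below illustrates the general idea behind MLSP features and the SRCC evaluation metric. Activations are tapped at several depths of a pretrained Inception-v3 network, globally average-pooled over the spatial dimensions, and concatenated into one feature vector per frame (averaged over frames for a video-level descriptor); SRCC then compares predicted against ground-truth quality scores. This is a minimal sketch under stated assumptions, not the authors' released code: the use of torchvision's inception_v3, the particular Mixed_* blocks hooked, and the small feed-forward regressor are all illustrative choices that stand in for the paper's exact configuration.

```python
# Minimal sketch of MLSP-style feature extraction and SRCC evaluation.
# Assumptions (not the authors' released code): torchvision's Inception-v3
# stands in for the backbone, and only three Mixed_* blocks are tapped;
# the paper's exact layer set and regressor architecture differ.
import torch
import torchvision.models as models
from scipy.stats import spearmanr

model = models.inception_v3(weights="DEFAULT").eval()

# Tap activations at several depths of the network.
taps = ["Mixed_5d", "Mixed_6e", "Mixed_7c"]  # illustrative subset
pooled = {}

def make_hook(name):
    def hook(module, inputs, output):
        # (N, C, H, W) -> (N, C): spatial global average pooling
        pooled[name] = output.mean(dim=(2, 3))
    return hook

for name in taps:
    getattr(model, name).register_forward_hook(make_hook(name))

@torch.no_grad()
def mlsp_features(frames):
    """frames: (N, 3, 299, 299) tensor of preprocessed video frames.
    Returns one video-level descriptor: per-frame multi-level features,
    concatenated across levels, then averaged over frames."""
    model(frames)
    per_frame = torch.cat([pooled[n] for n in taps], dim=1)
    return per_frame.mean(dim=0)

# A hypothetical feed-forward quality regressor on top (MLSP-VQA-FF-like).
feat_dim = 288 + 768 + 2048  # channel counts of the tapped blocks
regressor = torch.nn.Sequential(
    torch.nn.Linear(feat_dim, 512), torch.nn.ReLU(), torch.nn.Linear(512, 1)
)

# SRCC between predicted and ground-truth mean opinion scores.
pred = [3.1, 2.4, 4.0, 1.8]  # toy predictions
mos = [3.3, 2.0, 4.2, 1.5]   # toy ground truth
srcc, _ = spearmanr(pred, mos)
print(f"SRCC = {srcc:.2f}")
```

On real data one would extract and preprocess frames from each video, pass them through mlsp_features, train the regressor against the dataset's mean opinion scores, and report SRCC on a held-out split.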
Pages: 72139-72160 (22 pages)
Related papers (50 total; first 10 shown):
  • [1] No-Reference Video Quality Assessment Using Transformers and Attention Recurrent Networks
    Kossi, Koffi
    Coulombe, Stephane
    Desrosiers, Christian
    IEEE ACCESS, 2024, 12 : 140671 - 140680
  • [2] Multi-Dimensional Feature Fusion Network for No-Reference Quality Assessment of In-the-Wild Videos
    Jiang, Jiu
    Wang, Xianpei
    Li, Bowen
    Tian, Meng
    Yao, Hongtai
    SENSORS, 2021, 21 (16)
  • [3] No-Reference Video Quality Assessment Using Distortion Learning and Temporal Attention
    Kossi, Koffi
    Coulombe, Stephane
    Desrosiers, Christian
    Gagnon, Ghyslain
    IEEE ACCESS, 2022, 10 : 41010 - 41022
  • [4] Quality Assessment of In-the-Wild Videos
    Li, Dingquan
    Jiang, Tingting
    Jiang, Ming
    PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019, : 2351 - 2359
  • [5] No-Reference Video Quality Assessment Using Local Structural and Quality-Aware Deep Features
    Vishwakarma, Anish Kumar
    Bhurchandi, Kishor M.
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72
  • [6] Learning Based Hybrid No-reference Video Quality Assessment of Compressed Videos
    Fazliani, Yasamin
    Andrade, Ernesto
    Shirani, Shahram
2019 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2019
  • [7] No-Reference Nonuniform Distorted Video Quality Assessment Based on Deep Multiple Instance Learning
    Qian, Lihui
    Pan, Tianxiang
    Zheng, Yunfei
    Zhang, Jiajie
    Li, Mading
    Yu, Bing
    Wang, Bin
    IEEE MULTIMEDIA, 2021, 28 (01) : 28 - 37
  • [8] No-Reference Video Quality Assessment Using Natural Spatiotemporal Scene Statistics
    Dendi, Sathya Veera Reddy
    Channappayya, Sumohana S.
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 5612 - 5624
  • [9] A No-Reference Quality Assessment Model for Screen Content Videos via Hierarchical Spatiotemporal Perception
    Liu, Zhihong
    Zeng, Huanqiang
    Chen, Jing
    Ding, Rui
    Shi, Yifan
    Hou, Junhui
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2025, 35 (02) : 1422 - 1435
  • [10] Learning Generalized Spatial-Temporal Deep Feature Representation for No-Reference Video Quality Assessment
    Chen, Baoliang
    Zhu, Lingyu
    Li, Guo
    Lu, Fangbo
    Fan, Hongfei
    Wang, Shiqi
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (04) : 1903 - 1916