KonVid-150k: A Dataset for No-Reference Video Quality Assessment of Videos in-the-Wild

Cited by: 33
Authors:
Goetz-Hahn, Franz [1 ]
Hosu, Vlad [1 ]
Lin, Hanhe [1 ]
Saupe, Dietmar [1 ]
Affiliation:
[1] Univ Konstanz, Dept Comp Sci, D-78464 Constance, Germany
Keywords:
Streaming media; distortion; feature extraction; quality assessment; video recording; training; cameras; datasets; deep transfer learning; multi-level spatially-pooled features; video quality assessment; video quality dataset; prediction
DOI:
10.1109/ACCESS.2021.3077642
Chinese Library Classification (CLC): TP [Automation Technology; Computer Technology]
Discipline classification code: 0812
Abstract:
Video quality assessment (VQA) methods focus on particular degradation types, usually artificially induced on a small set of reference videos. Hence, most traditional VQA methods under-perform in-the-wild. Deep learning approaches have had limited success due to the small size and diversity of existing VQA datasets, whether artificially or authentically distorted. We introduce a new in-the-wild VQA dataset that is substantially larger and more diverse: KonVid-150k. It consists of a coarsely annotated set of 153,841 videos with five quality ratings each, and 1,596 videos with a minimum of 89 ratings each. Additionally, we propose new efficient VQA approaches (MLSP-VQA) relying on multi-level spatially pooled deep features (MLSP). Compared to deep transfer learning approaches, they are exceptionally well suited for training at scale. Our best method, MLSP-VQA-FF, improves the Spearman rank-order correlation coefficient (SRCC) performance metric on the commonly used KoNViD-1k in-the-wild benchmark dataset to 0.82. It surpasses the best existing deep-learning model (0.80 SRCC) and the best hand-crafted feature-based method (0.78 SRCC). We further investigate how alternative approaches perform under different levels of label noise and different dataset sizes, showing that MLSP-VQA-FF is the overall best method for videos in-the-wild. Finally, we show that the MLSP-VQA models trained on KonVid-150k set a new state of the art for cross-test performance on KoNViD-1k and LIVE-Qualcomm, with SRCCs of 0.83 and 0.64, respectively. For KoNViD-1k, this inter-dataset testing outperforms intra-dataset experiments, showing excellent generalization.
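For readers who want to experiment with the two technical ingredients the abstract names, the sketch below illustrates the general idea behind MLSP features and the SRCC evaluation metric. Activations are tapped at several depths of a pretrained Inception-v3 network, globally average-pooled over the spatial dimensions, and concatenated into one feature vector per frame (averaged over frames for a video-level descriptor); SRCC then compares predicted against ground-truth quality scores. This is a minimal sketch under stated assumptions, not the authors' released code: the use of torchvision's inception_v3, the particular Mixed_* blocks hooked, and the small feed-forward regressor are all illustrative choices that stand in for the paper's exact configuration.

```python
# Minimal sketch of MLSP-style feature extraction and SRCC evaluation.
# Assumptions (not the authors' released code): torchvision's Inception-v3
# stands in for the backbone, and only three Mixed_* blocks are tapped;
# the paper's exact layer set and regressor architecture differ.
import torch
import torchvision.models as models
from scipy.stats import spearmanr

model = models.inception_v3(weights="DEFAULT").eval()

# Tap activations at several depths of the network.
taps = ["Mixed_5d", "Mixed_6e", "Mixed_7c"]  # illustrative subset
pooled = {}

def make_hook(name):
    def hook(module, inputs, output):
        # (N, C, H, W) -> (N, C): spatial global average pooling
        pooled[name] = output.mean(dim=(2, 3))
    return hook

for name in taps:
    getattr(model, name).register_forward_hook(make_hook(name))

@torch.no_grad()
def mlsp_features(frames):
    """frames: (N, 3, 299, 299) tensor of preprocessed video frames.
    Returns one video-level descriptor: per-frame multi-level features,
    concatenated across levels, then averaged over frames."""
    model(frames)
    per_frame = torch.cat([pooled[n] for n in taps], dim=1)
    return per_frame.mean(dim=0)

# A hypothetical feed-forward quality regressor on top (MLSP-VQA-FF-like).
feat_dim = 288 + 768 + 2048  # channel counts of the tapped blocks
regressor = torch.nn.Sequential(
    torch.nn.Linear(feat_dim, 512), torch.nn.ReLU(), torch.nn.Linear(512, 1)
)

# SRCC between predicted and ground-truth mean opinion scores.
pred = [3.1, 2.4, 4.0, 1.8]  # toy predictions
mos = [3.3, 2.0, 4.2, 1.5]   # toy ground truth
srcc, _ = spearmanr(pred, mos)
print(f"SRCC = {srcc:.2f}")
```

On real data one would extract and preprocess frames from each video, pass them through mlsp_features, train the regressor against the dataset's mean opinion scores, and report SRCC on a held-out split.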
Pages: 72139-72160 (22 pages)
Related papers (50 total; first 10 shown):
  • [1] No-Reference Video Quality Assessment Using Transformers and Attention Recurrent Networks
    Kossi, Koffi
    Coulombe, Stephane
    Desrosiers, Christian
    IEEE ACCESS, 2024, 12 : 140671 - 140680
  • [2] Multi-Dimensional Feature Fusion Network for No-Reference Quality Assessment of In-the-Wild Videos
    Jiang, Jiu
    Wang, Xianpei
    Li, Bowen
    Tian, Meng
    Yao, Hongtai
    SENSORS, 2021, 21 (16)
  • [3] No-Reference Video Quality Assessment Using Distortion Learning and Temporal Attention
    Kossi, Koffi
    Coulombe, Stephane
    Desrosiers, Christian
    Gagnon, Ghyslain
    IEEE ACCESS, 2022, 10 : 41010 - 41022
  • [4] Quality Assessment of In-the-Wild Videos
    Li, Dingquan
    Jiang, Tingting
    Jiang, Ming
    PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019, : 2351 - 2359
  • [5] No-Reference Video Quality Assessment Using Local Structural and Quality-Aware Deep Features
    Vishwakarma, Anish Kumar
    Bhurchandi, Kishor M.
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72
  • [6] Learning Based Hybrid No-reference Video Quality Assessment of Compressed Videos
    Fazliani, Yasamin
    Andrade, Ernesto
    Shirani, Shahram
2019 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2019
  • [7] No-Reference Nonuniform Distorted Video Quality Assessment Based on Deep Multiple Instance Learning
    Qian, Lihui
    Pan, Tianxiang
    Zheng, Yunfei
    Zhang, Jiajie
    Li, Mading
    Yu, Bing
    Wang, Bin
    IEEE MULTIMEDIA, 2021, 28 (01) : 28 - 37
  • [8] No-Reference Video Quality Assessment Using Natural Spatiotemporal Scene Statistics
    Dendi, Sathya Veera Reddy
    Channappayya, Sumohana S.
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 5612 - 5624
  • [9] A No-Reference Quality Assessment Model for Screen Content Videos via Hierarchical Spatiotemporal Perception
    Liu, Zhihong
    Zeng, Huanqiang
    Chen, Jing
    Ding, Rui
    Shi, Yifan
    Hou, Junhui
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2025, 35 (02) : 1422 - 1435
  • [10] Learning Generalized Spatial-Temporal Deep Feature Representation for No-Reference Video Quality Assessment
    Chen, Baoliang
    Zhu, Lingyu
    Li, Guo
    Lu, Fangbo
    Fan, Hongfei
    Wang, Shiqi
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (04) : 1903 - 1916