Efficient video quality assessment with deeper spatiotemporal feature extraction and integration

Cited by: 5
Authors
Liu, Yinhao [1]
Zhou, Xiaofei [2]
Yin, Haibing [1]
Wang, Hongkui [1]
Yan, Chenggang [2]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Commun Engn, Hangzhou, Peoples R China
[2] Hangzhou Dianzi Univ, Sch Automat, Hangzhou, Peoples R China
Keywords
video quality assessment; no-reference/blind; user-generated content; deep learning; deeper temporal correlation; reverse hierarchy theory
DOI
10.1117/1.JEI.30.6.063034
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract
The challenge of video quality assessment (VQA) modeling for user-generated content (UGC), i.e., UGC-VQA, is to extract discriminative features accurately and to quantify interfeature interactions in a way that follows the behavior of human eye-brain visual perception. To address this issue, we propose the Deeper Spatial-Temporal Scoring Network (DSTS-Net) for precise VQA. Concretely, we first deploy a multiscale feature extraction module to characterize content-aware features, accounting for the nonlinear reverse hierarchy theory of the video perception process, which is not fully considered in previously reported UGC-VQA models. Hierarchical handcrafted and semantic features are considered simultaneously using content-adaptive weighting. Second, we develop a feature integration structure, the deeper gated recurrent unit (DGRU), to imitate the interfeature interactions in visual perception, including feedforward and feedback processes. Third, a dual DGRU structure is employed to further account for interframe interactions of hierarchical features, imitating the nonlinearity of perception as closely as possible. Finally, improved pooling is achieved in a local adaptive smoothing module that accounts for temporal hysteresis. Validation of the proposed method on four challenging public UGC-VQA datasets shows performance comparable to state-of-the-art no-reference VQA methods; in particular, our method accurately predicts the quality of low-quality videos with weak temporal correlation. To promote reproducible research and public evaluation, an implementation of our method has been made available online: https://github.com/liu0527aa/DSTS-Net. © 2021 SPIE and IS&T
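The abstract's final step pools per-frame scores while accounting for temporal hysteresis. The record does not describe how the local adaptive smoothing module works, so the following is only a generic sketch of hysteresis-style pooling, not the DSTS-Net module. The function name `hysteresis_pool`, the window length `tau`, the mixing weight `gamma`, and the assumption that frame scores are normalized to roughly [0, 1] are all illustrative choices.

```python
import math


def hysteresis_pool(frame_scores, tau=12, gamma=0.5):
    """Pool per-frame quality scores into a single video-level score.

    Assumption: a generic hysteresis-style pooling for illustration only;
    it is not the local adaptive smoothing module of DSTS-Net.
    """
    if not frame_scores:
        raise ValueError("frame_scores must be non-empty")
    n = len(frame_scores)
    pooled = []
    for t in range(n):
        # Memory term: the worst score among the previous `tau` frames,
        # modelling the lasting impression left by a quality drop.
        past = frame_scores[max(0, t - tau):t]
        memory = min(past) if past else frame_scores[t]
        # Current term: softmin-weighted mean over the current frame and
        # the next `tau` frames, so lower scores receive larger weights.
        future = frame_scores[t:min(n, t + tau + 1)]
        weights = [math.exp(-q) for q in future]
        current = sum(w * q for w, q in zip(weights, future)) / sum(weights)
        pooled.append(gamma * memory + (1.0 - gamma) * current)
    # Video-level score: plain average of the smoothed frame scores.
    return sum(pooled) / n
```

With constant frame scores the pooled value equals that constant, while a single low-quality frame pulls the pooled score below the plain mean because it dominates the memory term for the following `tau` frames.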
Pages: 21
Related papers
50 records in total
  • [1] Spatiotemporal Feature Combination Model for No-Reference Video Quality Assessment
    Men, Hui
    Lin, Hanhe
    Saupe, Dietmar
    2018 TENTH INTERNATIONAL CONFERENCE ON QUALITY OF MULTIMEDIA EXPERIENCE (QOMEX), 2018, : 72 - 74
  • [2] Spatiotemporal Statistics for Video Quality Assessment
    Li, Xuelong
    Guo, Qun
    Lu, Xiaoqiang
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2016, 25 (07) : 3329 - 3342
  • [3] Spatiotemporal Masking for Objective Video Quality Assessment
    He, Ran
    Lu, Wen
    Zhang, Yu
    Gao, Xinbo
    He, Lihuo
    PATTERN RECOGNITION AND COMPUTER VISION (PRCV 2018), PT I, 2018, 11256 : 309 - 321
  • [4] An End-to-End No-Reference Video Quality Assessment Method With Hierarchical Spatiotemporal Feature Representation
    Shen, Wenhao
    Zhou, Mingliang
    Liao, Xingran
    Jia, Weijia
    Xiang, Tao
    Fang, Bin
    Shang, Zhaowei
    IEEE TRANSACTIONS ON BROADCASTING, 2022, 68 (03) : 651 - 660
  • [5] Spatiotemporal Saliency Detection based Video Quality Assessment
    Jia, Changcheng
    Lu, Wen
    He, Lihuo
    He, Ran
    8TH INTERNATIONAL CONFERENCE ON INTERNET MULTIMEDIA COMPUTING AND SERVICE (ICIMCS2016), 2016, : 340 - 343
  • [6] Integrates Spatiotemporal Visual Stimuli for Video Quality Assessment
    Guo, Wenzhong
    Zhang, Kairui
    Ke, Xiao
    IEEE TRANSACTIONS ON BROADCASTING, 2024, 70 (01) : 223 - 237
  • [7] A study on deep learning spatiotemporal models and feature extraction techniques for video understanding
    M. Suresha
    S. Kuppa
    D. S. Raghukumar
    International Journal of Multimedia Information Retrieval, 2020, 9 : 81 - 101
  • [8] A study on deep learning spatiotemporal models and feature extraction techniques for video understanding
    Suresha, M.
    Kuppa, S.
    Raghukumar, D. S.
    INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL, 2020, 9 (02) : 81 - 101
  • [9] A Blind Video Quality Assessment Method via Spatiotemporal Pyramid Attention
    Shen, Wenhao
    Zhou, Mingliang
    Wei, Xuekai
    Wang, Heqiang
    Fang, Bin
    Ji, Cheng
    Zhuang, Xu
    Wang, Jason
    Luo, Jun
    Pu, Huayan
    Huang, Xiaoxu
    Wang, Shilong
    Cao, Huajun
    Feng, Yong
    Xiang, Tao
    Shang, Zhaowei
    IEEE TRANSACTIONS ON BROADCASTING, 2024, 70 (01) : 251 - 264
  • [10] Video quality assessment based on correlation between spatiotemporal motion energies
    Yan, Peng
    Mou, Xuanqin
    APPLICATIONS OF DIGITAL IMAGE PROCESSING XXXIX, 2016, 9971