Perception-Oriented U-Shaped Transformer Network for 360-Degree No-Reference Image Quality Assessment

Times cited: 42
Authors
Zhou, Mingliang [1]
Chen, Lei [1]
Wei, Xuekai [1]
Liao, Xingran [2]
Mao, Qin [3, 4]
Wang, Heqiang [1]
Pu, Huayan [5]
Luo, Jun [5]
Xiang, Tao [1]
Fang, Bin [1]
Affiliations
[1] Chongqing Univ, Sch Comp Sci, Chongqing 400044, Peoples R China
[2] City Univ Hong Kong, Comp Sci Dept, Hong Kong, Peoples R China
[3] Qiannan Normal Coll Nationalities, Coll Comp & Informat, Duyun 558000, Peoples R China
[4] Qiannan Normal Univ Nationalities, Sch Comp & Informat, Key Lab Complex Syst & Intelligent Optimizat Guizh, Duyun 558000, Peoples R China
[5] Chongqing Univ, State Key Lab Mech Transmiss, Chongqing 400044, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Image quality assessment; no-reference image quality assessment; 360-degree image; U-shaped transformer; OMNIDIRECTIONAL IMAGE; SALIENCY; CNN;
DOI
10.1109/TBC.2022.3231101
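Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract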
Generally, 360-degree images have absolute senses of reality and three-dimensionality, providing a wide range of immersive interactions. Due to the novel rendering and display technology of 360-degree images, they have more complex perceptual characteristics than other images. It is challenging to perform comprehensive image quality assessment (IQA) learning by simply stacking multichannel neural network architectures for pre/postprocessing, compression, and rendering tasks. To thoroughly learn the global and local features in 360-degree images, reduce the complexity of multichannel neural network models and simplify the training process, this paper proposes a joint architecture with user perception and an efficient transformer dedicated to 360-degree no-reference (NR) IQA. The input of the proposed method is a 360-degree cube map projection (CMP) image. Furthermore, the proposed 360-degree NRIQA method includes a saliency map-based non-overlapping self-attention selection module and a U-shaped transformer (U-former)-based feature extraction module to account for perceptual region importance and projection distortion. The transformer-based architecture and the weighted average technique are jointly utilized for predicting local perceptual quality. Experimental results obtained on widely used databases show that the proposed model outperforms other state-of-the-art methods in NR 360-degree image quality evaluation cases. Furthermore, a cross-database evaluation and an ablation study also demonstrate the inherent robustness and generalization ability of the proposed model.
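The abstract says that saliency-based region selection and a weighted average are used together to predict local perceptual quality, but it does not give the exact formulation. The following is a minimal sketch of one plausible reading, assuming that each patch of a CMP face already has a predicted quality score and a mean saliency value, that the most salient patches are kept, and that saliency itself serves as the averaging weight. The function names (`select_salient_patches`, `weighted_quality`, `predict_quality`) and the top-k selection rule are hypothetical and are not taken from the paper.

```python
def select_salient_patches(saliency, k):
    """Return indices of the k patches with the highest saliency.

    Assumption: top-k selection stands in for the paper's saliency
    map-based selection module, whose details the abstract omits.
    """
    order = sorted(range(len(saliency)), key=lambda i: saliency[i], reverse=True)
    return order[:k]


def weighted_quality(patch_scores, weights):
    """Weighted average of per-patch quality scores."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(patch_scores, weights)) / total


def predict_quality(patch_scores, saliency, k):
    """Pool patch scores into one image score (hypothetical pipeline).

    Assumption: saliency values double as the pooling weights.
    """
    idx = select_salient_patches(saliency, k)
    return weighted_quality(
        [patch_scores[i] for i in idx],
        [saliency[i] for i in idx],
    )


if __name__ == "__main__":
    # Toy values only; in practice the scores would come from a feature
    # extraction network and the saliency from a saliency model.
    scores = [3.0, 4.0, 2.0, 5.0]
    saliency = [0.1, 0.4, 0.2, 0.3]
    print(predict_quality(scores, saliency, k=2))
```

In this sketch, patches with low saliency are dropped before pooling, so regions a viewer is unlikely to look at do not influence the final score.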
Pages: 396-405
Page count: 10