Perception-Oriented U-Shaped Transformer Network for 360-Degree No-Reference Image Quality Assessment

Cited by: 42
Authors
Zhou, Mingliang [1 ]
Chen, Lei [1 ]
Wei, Xuekai [1 ]
Liao, Xingran [2 ]
Mao, Qin [3 ,4 ]
Wang, Heqiang [1 ]
Pu, Huayan [5 ]
Luo, Jun [5 ]
Xiang, Tao [1 ]
Fang, Bin [1 ]
Affiliations
[1] Chongqing Univ, Sch Comp Sci, Chongqing 400044, Peoples R China
[2] City Univ Hong Kong, Comp Sci Dept, Hong Kong, Peoples R China
[3] Qiannan Normal Coll Nationalities, Coll Comp & Informat, Duyun 558000, Peoples R China
[4] Qiannan Normal Univ Nationalities, Sch Comp & Informat, Key Lab Complex Syst & Intelligent Optimizat Guizh, Duyun 558000, Peoples R China
[5] Chongqing Univ, State Key Lab Mech Transmiss, Chongqing 400044, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Image quality assessment; no-reference image quality assessment; 360-degree image; U-shaped transformer; OMNIDIRECTIONAL IMAGE; SALIENCY; CNN;
DOI
10.1109/TBC.2022.3231101
Chinese Library Classification
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology];
Discipline Classification Code
0808; 0809;
Abstract
360-degree images provide a strong sense of reality and three-dimensionality and support a wide range of immersive interactions. Because of their distinctive rendering and display technology, they exhibit more complex perceptual characteristics than conventional images, and comprehensive image quality assessment (IQA) cannot be achieved by simply stacking multichannel neural network architectures for pre/postprocessing, compression, and rendering tasks. To learn both the global and local features of 360-degree images while reducing the complexity of multichannel models and simplifying training, this paper proposes an architecture that combines user-perception modeling with an efficient transformer dedicated to 360-degree no-reference (NR) IQA. The proposed method takes a 360-degree cube map projection (CMP) image as input and comprises a saliency-map-based non-overlapping self-attention selection module and a U-shaped transformer (U-former) feature extraction module to account for perceptual region importance and projection distortion. The transformer-based architecture and a weighted-average technique are jointly used to predict local perceptual quality. Experimental results on widely used databases show that the proposed model outperforms state-of-the-art NR 360-degree IQA methods, and a cross-database evaluation and an ablation study further demonstrate its robustness and generalization ability.
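For illustration only, the following minimal Python sketch (not the authors' code) shows the two pooling ideas stated in the abstract: selecting non-overlapping CMP patches by saliency and combining per-patch quality scores into a global score by a saliency-weighted average. The per-patch scores are stand-ins for whatever a backbone such as the paper's U-former would predict; all names and parameters here are assumptions.

import numpy as np

def select_patches(saliency, patch=64, top_k=8):
    # Split one CMP face into non-overlapping patches and keep the top_k
    # patches ranked by mean saliency (stand-in for the saliency-based
    # non-overlapping selection described in the abstract).
    h, w = saliency.shape
    coords, weights = [], []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            coords.append((y, x))
            weights.append(saliency[y:y + patch, x:x + patch].mean())
    order = np.argsort(weights)[::-1][:top_k]
    return [coords[i] for i in order], np.asarray([weights[i] for i in order])

def pooled_quality(patch_scores, patch_saliency):
    # Saliency-weighted average of per-patch quality predictions.
    w = patch_saliency / (patch_saliency.sum() + 1e-8)
    return float(np.sum(w * np.asarray(patch_scores)))

rng = np.random.default_rng(0)
sal = rng.random((256, 256))            # saliency map of one CMP face (random stand-in)
coords, sal_w = select_patches(sal)     # top-k non-overlapping patch locations and weights
scores = rng.random(len(coords))        # hypothetical per-patch quality scores
print(round(pooled_quality(scores, sal_w), 4))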
Pages: 396-405
Page count: 10