Multitask Deep Neural Network With Knowledge-Guided Attention for Blind Image Quality Assessment

Cited by: 5
Authors
Zhou, Tianwei [1 ]
Tan, Songbai [1 ]
Zhao, Baoquan [2 ]
Yue, Guanghui [3 ,4 ]
Affiliations
[1] Shenzhen Univ, Sch Management, Shenzhen, Peoples R China
[2] Sun Yat Sen Univ, Sch Artificial Intelligence, Zhuhai 519082, Peoples R China
[3] Shenzhen Univ, Sch Biomed Engn, Natl Reg Key Technol Engn Lab Med Ultrasound, Med Sch, Shenzhen 518060, Peoples R China
[4] Shenzhen Univ, Med Sch, Sch Biomed Engn, Guangdong Key Lab Biomed Measurements & Ultrasound, Shenzhen 518060, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Distortion; Task analysis; Feature extraction; Image quality; Databases; Transformers; Circuits and systems; Image quality assessment; synthetic distortion; knowledge-guided attention; multitask learning; transformer; DISTORTED IMAGES; STATISTICS;
DOI
10.1109/TCSVT.2024.3375344
CLC Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline Codes
0808 ; 0809 ;
Abstract
Blind image quality assessment (BIQA) aims to predict the perceptual quality of an image without any reference information. However, existing methods leave considerable room for performance improvement because they make limited use of distortion knowledge. This paper proposes a novel multitask-learning-based BIQA method, termed KGANet, which takes image distortion classification as an auxiliary task and uses the knowledge learned from that task to assist accurate quality prediction. Unlike existing CNN-based methods, KGANet adopts a transformer backbone for feature extraction, which learns more powerful and robust representations. Specifically, it comprises two essential components: a cross-layer information fusion (CIF) module and a knowledge-guided attention (KGA) module. Because both global and local distortions can appear in an image, CIF fuses the features of adjacent layers extracted by the backbone to obtain a multiscale feature representation. KGA combines the distortion probabilities estimated by the auxiliary task with distortion embeddings, which are selected from subword-unit embeddings based on a textual template, to form distortion knowledge. This knowledge then serves as guidance to enhance the features of each layer and strengthen the connection between the main and auxiliary tasks. We demonstrate the effectiveness of the proposed KGANet through extensive experiments on benchmark databases. Experimental results show that KGANet correlates well with subjective perceptual judgments and outperforms 12 state-of-the-art BIQA methods.
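To make the two modules in the abstract concrete, the following is a toy NumPy sketch of the general ideas only: adjacent-layer feature fusion (CIF-style) and probability-weighted distortion embeddings used to gate features (KGA-style). All shapes, function names, and the specific fusion and gating rules are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def cross_layer_fusion(feat_lo, feat_hi):
    """CIF-style fusion (toy version): upsample the deeper layer's
    feature map by nearest-neighbor repetition, then concatenate it
    with the shallower layer along the channel axis to obtain a
    multiscale representation."""
    up = np.repeat(np.repeat(feat_hi, 2, axis=0), 2, axis=1)
    return np.concatenate([feat_lo, up], axis=-1)

def knowledge_guided_attention(features, dist_probs, dist_embeds):
    """KGA-style guidance (toy version): weight per-distortion
    embeddings by the auxiliary classifier's probabilities to form a
    'knowledge' vector, then use a sigmoid of that vector as a
    channel-wise gate on the features."""
    knowledge = dist_probs @ dist_embeds        # (C,) weighted embedding
    gate = 1.0 / (1.0 + np.exp(-knowledge))     # per-channel gate in (0, 1)
    return features * gate                      # channel-wise enhancement

# Hypothetical shapes: an 8x8x16 shallow map, a 4x4x16 deeper map,
# 3 distortion classes, and 32-dim distortion embeddings.
feat_lo = np.ones((8, 8, 16))
feat_hi = np.ones((4, 4, 16))
fused = cross_layer_fusion(feat_lo, feat_hi)            # (8, 8, 32)
probs = softmax(np.array([2.0, 0.5, -1.0]))             # auxiliary-task output
embeds = np.full((3, 32), 0.5)                          # stand-in embeddings
guided = knowledge_guided_attention(fused, probs, embeds)
```

The gate here is one simple choice among many; the point is only that the auxiliary task's class probabilities, not the features alone, decide how strongly each channel is emphasized.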
Pages: 7577 - 7588
Page count: 12
References
61 in total
  • [1] Deep Neural Networks for No-Reference and Full-Reference Image Quality Assessment
    Bosse, Sebastian
    Maniry, Dominique
    Mueller, Klaus-Robert
    Wiegand, Thomas
    Samek, Wojciech
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2018, 27 (01) : 206 - 219
  • [2] Blind image quality assessment by simulating the visual cortex
    Cai, Rongtai
    Fang, Ming
    [J]. VISUAL COMPUTER, 2023, 39 (10) : 4639 - 4656
  • [3] No-Reference Screen Content Image Quality Assessment With Unsupervised Domain Adaptation
    Chen, Baoliang
    Li, Haoliang
    Fan, Hongfei
    Wang, Shiqi
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 5463 - 5476
  • [4] Perceptual Image Quality Assessment with Transformers
    Cheon, Manri
    Yoon, Sung-Jun
    Kang, Byungyeon
    Lee, Junwoo
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021, : 433 - 442
  • [5] No-Reference Quality Assessment of Contrast-Distorted Images Based on Natural Scene Statistics
    Fang, Yuming
    Ma, Kede
    Wang, Zhou
    Lin, Weisi
    Fang, Zhijun
    Zhai, Guangtao
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2015, 22 (07) : 838 - 842
  • [6] Massive Online Crowdsourced Study of Subjective and Objective Picture Quality
    Ghadiyaram, Deepti
    Bovik, Alan C.
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2016, 25 (01) : 372 - 387
  • [7] No-Reference Image Quality Assessment via Transformers, Relative Ranking, and Self-Consistency
    Golestaneh, S. Alireza
    Dadsetan, Saba
    Kitani, Kris M.
    [J]. 2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 3989 - 3999
  • [8] Heinzerling B, 2017, arXiv preprint, arXiv:1710.02187
  • [9] KonIQ-10k: An Ecologically Valid Database for Deep Learning of Blind Image Quality Assessment
    Hosu, Vlad
    Lin, Hanhe
    Sziranyi, Tamas
    Saupe, Dietmar
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 4041 - 4056
  • [10] Kang L, 2015, IEEE IMAGE PROC, P2791, DOI 10.1109/ICIP.2015.7351311