Multitask Deep Neural Network With Knowledge-Guided Attention for Blind Image Quality Assessment

Cited by: 5
Authors
Zhou, Tianwei [1 ]
Tan, Songbai [1 ]
Zhao, Baoquan [2 ]
Yue, Guanghui [3 ,4 ]
Affiliations
[1] Shenzhen Univ, Sch Management, Shenzhen, Peoples R China
[2] Sun Yat Sen Univ, Sch Artificial Intelligence, Zhuhai 519082, Peoples R China
[3] Shenzhen Univ, Sch Biomed Engn, Natl Reg Key Technol Engn Lab Med Ultrasound, Med Sch, Shenzhen 518060, Peoples R China
[4] Shenzhen Univ, Med Sch, Sch Biomed Engn, Guangdong Key Lab Biomed Measurements & Ultrasound, Shenzhen 518060, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Distortion; Task analysis; Feature extraction; Image quality; Databases; Transformers; Circuits and systems; Image quality assessment; synthetic distortion; knowledge-guided attention; multitask learning; transformer; DISTORTED IMAGES; STATISTICS;
DOI
10.1109/TCSVT.2024.3375344
CLC Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline Codes
0808 ; 0809 ;
Abstract
Blind image quality assessment (BIQA) aims to predict the perceptual quality of an image without any reference information. However, existing methods leave considerable room for performance improvement because they make limited use of distortion knowledge. This paper proposes a novel multitask-learning-based BIQA method, termed KGANet, which takes image distortion classification as an auxiliary task and uses the knowledge learned from that task to assist accurate quality prediction. Unlike existing CNN-based methods, KGANet adopts a transformer backbone for feature extraction, which learns more powerful and robust representations. Specifically, it comprises two essential components: a cross-layer information fusion (CIF) module and a knowledge-guided attention (KGA) module. Because both global and local distortions can appear in an image, CIF fuses the features of adjacent layers extracted by the backbone to obtain a multiscale feature representation. KGA combines the distortion probabilities estimated by the auxiliary task with distortion embeddings, which are selected from subword-unit embeddings based on a textual template, to form distortion knowledge. This knowledge then serves as guidance to enhance the features of each layer and strengthen the connection between the main and auxiliary tasks. We demonstrate the effectiveness of the proposed KGANet through extensive experiments on benchmark databases. Experimental results show that KGANet correlates well with subjective perceptual judgments and outperforms 12 state-of-the-art BIQA methods.
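To make the two modules in the abstract concrete, the following is a toy NumPy sketch of the general ideas only: adjacent-layer feature fusion (CIF-style) and probability-weighted distortion embeddings used to gate features (KGA-style). All shapes, function names, and the specific fusion and gating rules are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def cross_layer_fusion(feat_lo, feat_hi):
    """CIF-style fusion (toy version): upsample the deeper layer's
    feature map by nearest-neighbor repetition, then concatenate it
    with the shallower layer along the channel axis to obtain a
    multiscale representation."""
    up = np.repeat(np.repeat(feat_hi, 2, axis=0), 2, axis=1)
    return np.concatenate([feat_lo, up], axis=-1)

def knowledge_guided_attention(features, dist_probs, dist_embeds):
    """KGA-style guidance (toy version): weight per-distortion
    embeddings by the auxiliary classifier's probabilities to form a
    'knowledge' vector, then use a sigmoid of that vector as a
    channel-wise gate on the features."""
    knowledge = dist_probs @ dist_embeds        # (C,) weighted embedding
    gate = 1.0 / (1.0 + np.exp(-knowledge))     # per-channel gate in (0, 1)
    return features * gate                      # channel-wise enhancement

# Hypothetical shapes: an 8x8x16 shallow map, a 4x4x16 deeper map,
# 3 distortion classes, and 32-dim distortion embeddings.
feat_lo = np.ones((8, 8, 16))
feat_hi = np.ones((4, 4, 16))
fused = cross_layer_fusion(feat_lo, feat_hi)            # (8, 8, 32)
probs = softmax(np.array([2.0, 0.5, -1.0]))             # auxiliary-task output
embeds = np.full((3, 32), 0.5)                          # stand-in embeddings
guided = knowledge_guided_attention(fused, probs, embeds)
```

The gate here is one simple choice among many; the point is only that the auxiliary task's class probabilities, not the features alone, decide how strongly each channel is emphasized.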
Pages: 7577 - 7588
Page count: 12
References
61 in total
  • [1] Deep Neural Networks for No-Reference and Full-Reference Image Quality Assessment
    Bosse, Sebastian
    Maniry, Dominique
    Mueller, Klaus-Robert
    Wiegand, Thomas
    Samek, Wojciech
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2018, 27 (01) : 206 - 219
  • [2] Blind image quality assessment by simulating the visual cortex
    Cai, Rongtai
    Fang, Ming
    [J]. VISUAL COMPUTER, 2023, 39 (10) : 4639 - 4656
  • [3] No-Reference Screen Content Image Quality Assessment With Unsupervised Domain Adaptation
    Chen, Baoliang
    Li, Haoliang
    Fan, Hongfei
    Wang, Shiqi
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 5463 - 5476
  • [4] Perceptual Image Quality Assessment with Transformers
    Cheon, Manri
    Yoon, Sung-Jun
    Kang, Byungyeon
    Lee, Junwoo
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021, : 433 - 442
  • [5] No-Reference Quality Assessment of Contrast-Distorted Images Based on Natural Scene Statistics
    Fang, Yuming
    Ma, Kede
    Wang, Zhou
    Lin, Weisi
    Fang, Zhijun
    Zhai, Guangtao
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2015, 22 (07) : 838 - 842
  • [6] Massive Online Crowdsourced Study of Subjective and Objective Picture Quality
    Ghadiyaram, Deepti
    Bovik, Alan C.
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2016, 25 (01) : 372 - 387
  • [7] No-Reference Image Quality Assessment via Transformers, Relative Ranking, and Self-Consistency
    Golestaneh, S. Alireza
    Dadsetan, Saba
    Kitani, Kris M.
    [J]. 2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 3989 - 3999
  • [8] Heinzerling B, 2017, arXiv preprint, arXiv:1710.02187
  • [9] KonIQ-10k: An Ecologically Valid Database for Deep Learning of Blind Image Quality Assessment
    Hosu, Vlad
    Lin, Hanhe
    Sziranyi, Tamas
    Saupe, Dietmar
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 4041 - 4056
  • [10] Kang L, 2015, IEEE IMAGE PROC, P2791, DOI 10.1109/ICIP.2015.7351311