TOPIQ: A Top-Down Approach From Semantics to Distortions for Image Quality Assessment

Cited by: 52

Authors:

Chen, Chaofeng [1]

Mo, Jiadi [1]

Hou, Jingwen [2]

Wu, Haoning [1]

Liao, Liang [1]

Sun, Wenxiu [3,4]

Yan, Qiong [3,4]

Lin, Weisi [2]

Affiliations:

[1] Nanyang Technol Univ NTU, S Lab, Singapore 639798, Singapore

[2] Nanyang Technol Univ NTU, Sch Comp Sci & Engn, Singapore 639798, Singapore

[3] Tetras AI, Hong Kong, Peoples R China

[4] SenseTime Res, Hong Kong, Peoples R China

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2024 / Vol. 33

Keywords:

Image quality assessment; top-down approach; multi-scale features; cross-scale attention; SIMILARITY;

DOI:

10.1109/TIP.2024.3378466

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Image Quality Assessment (IQA) is a fundamental task in computer vision that has witnessed remarkable progress with deep neural networks. Inspired by the characteristics of the human visual system, existing methods typically combine global and local representations (i.e., multi-scale features) to achieve superior performance. However, most of them adopt simple linear fusion of multi-scale features and neglect the possibly complex relationships and interactions among them. In contrast, humans typically first form a global impression to locate important regions and then focus on local details in those regions. We therefore propose a top-down approach, named TOPIQ, that uses high-level semantics to guide the IQA network to focus on semantically important local distortion regions. Our approach involves the design of a heuristic coarse-to-fine network (CFANet) that leverages multi-scale features and progressively propagates multi-level semantic information to low-level representations in a top-down manner. A key component of our approach is the proposed cross-scale attention mechanism, which calculates attention maps for lower-level features guided by higher-level features. This mechanism emphasizes active semantic regions for low-level distortions, thereby improving performance. TOPIQ can be used for both Full-Reference (FR) and No-Reference (NR) IQA. We use ResNet50 as its backbone and demonstrate that TOPIQ achieves better or competitive performance on most public FR and NR benchmarks compared with state-of-the-art methods based on vision transformers, while being much more efficient (with only ~13% of the FLOPs of the current best FR method). Code is released at https://github.com/chaofengc/IQA-PyTorch.
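The cross-scale attention idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the released implementation: it assumes that queries are projected from the higher-level features while keys and values come from the lower-level features, and the function name `cross_scale_attention`, the projection matrices, and all shapes are hypothetical choices made only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_scale_attention(high, low, w_q, w_k, w_v):
    """Attend over low-level positions, guided by high-level features.

    high: (N_h, C_h) flattened high-level (semantic) feature map
    low:  (N_l, C_l) flattened low-level (distortion) feature map
    w_q, w_k, w_v: hypothetical projection matrices into a shared width D
    Returns the attended features (N_h, D) and the attention map (N_h, N_l).
    """
    q = high @ w_q  # queries from the higher level (assumption)
    k = low @ w_k   # keys from the lower level (assumption)
    v = low @ w_v   # values from the lower level (assumption)
    scores = q @ k.T / np.sqrt(q.shape[-1])  # scaled dot-product scores
    attn = softmax(scores, axis=-1)          # each row sums to 1
    return attn @ v, attn


# Toy shapes, chosen only to keep the example small.
rng = np.random.default_rng(0)
N_H, N_L, C_H, C_L, D = 4, 16, 32, 8, 8
high = rng.standard_normal((N_H, C_H))
low = rng.standard_normal((N_L, C_L))
w_q = rng.standard_normal((C_H, D))
w_k = rng.standard_normal((C_L, D))
w_v = rng.standard_normal((C_L, D))

out, attn = cross_scale_attention(high, low, w_q, w_k, w_v)
print(out.shape, attn.shape)
```

Each row of `attn` weights the low-level positions for one high-level position, which is one way to read "attention maps for lower-level features guided by higher-level features".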


Pages: 2404-2418

Page count: 15
