TOPIQ: A Top-Down Approach From Semantics to Distortions for Image Quality Assessment

Citations: 52
Authors
Chen, Chaofeng [1]
Mo, Jiadi [1]
Hou, Jingwen [2]
Wu, Haoning [1]
Liao, Liang [1]
Sun, Wenxiu [3, 4]
Yan, Qiong [3, 4]
Lin, Weisi [2]
Affiliations
[1] Nanyang Technol Univ NTU, S Lab, Singapore 639798, Singapore
[2] Nanyang Technol Univ NTU, Sch Comp Sci & Engn, Singapore 639798, Singapore
[3] Tetras AI, Hong Kong, Peoples R China
[4] SenseTime Res, Hong Kong, Peoples R China
Keywords
Image quality assessment; top-down approach; multi-scale features; cross-scale attention; SIMILARITY;
DOI
10.1109/TIP.2024.3378466
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Image Quality Assessment (IQA) is a fundamental task in computer vision that has witnessed remarkable progress with deep neural networks. Inspired by the characteristics of the human visual system, existing methods typically use a combination of global and local representations (i.e., multi-scale features) to achieve superior performance. However, most of them adopt simple linear fusion of multi-scale features and neglect the possibly complex relationships and interactions among them. In contrast, humans typically first form a global impression to locate important regions and then focus on local details in those regions. We therefore propose a top-down approach, named TOPIQ, that uses high-level semantics to guide the IQA network to focus on semantically important local distortion regions. Our approach involves the design of a heuristic coarse-to-fine network (CFANet) that leverages multi-scale features and progressively propagates multi-level semantic information to low-level representations in a top-down manner. A key component of our approach is the proposed cross-scale attention mechanism, which calculates attention maps for lower-level features guided by higher-level features. This mechanism emphasizes active semantic regions for low-level distortions, thereby improving performance. TOPIQ can be used for both Full-Reference (FR) and No-Reference (NR) IQA. We use ResNet50 as its backbone and demonstrate that TOPIQ achieves better or competitive performance on most public FR and NR benchmarks compared with state-of-the-art methods based on vision transformers, while being much more efficient (using only about 13% of the FLOPs of the current best FR method). Code is released at https://github.com/chaofengc/IQA-PyTorch.
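The cross-scale attention idea in the abstract can be illustrated with a minimal sketch. This is not the authors' CFANet code. It assumes a plain scaled dot-product attention in which higher-level feature vectors act as queries and lower-level feature vectors act as keys and values, and it omits the learned projections, multi-head structure, and spatial reshaping a real network would need. The function name `cross_scale_attention` and the toy inputs are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_scale_attention(high, low):
    """Attend over low-level features using high-level features as guidance.

    high: list of query vectors (assumed role of higher-level features)
    low:  list of key/value vectors (assumed role of lower-level features)
    Both must share the same vector dimension d.
    Returns (attended, weights), where weights[i][j] is how strongly
    query i attends to low-level position j.
    """
    d = len(high[0])
    scale = 1.0 / math.sqrt(d)
    weights = [
        softmax([scale * sum(q_i * k_i for q_i, k_i in zip(q, k)) for k in low])
        for q in high
    ]
    attended = [
        [sum(w * v[j] for w, v in zip(row, low)) for j in range(d)]
        for row in weights
    ]
    return attended, weights


# Toy example: two "semantic" queries and three low-level positions.
high = [[1.0, 0.0], [0.0, 1.0]]
low = [[4.0, 0.0], [0.0, 4.0], [0.0, 0.0]]
attended, weights = cross_scale_attention(high, low)
```

In this toy input, each query places its largest weight on the low-level position that best aligns with it, which is the sense in which higher-level features steer attention toward particular low-level regions.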
Pages: 2404-2418
Page count: 15