Lightweight transformer and multi-head prediction network for no-reference image quality assessment

Cited by: 3

Authors:

Tang, Zhenjun [1]

Chen, Yihua [1]

Chen, Zhiyuan [1]

Liang, Xiaoping [1]

Zhang, Xianquan [1]

Affiliation:

[1] Guangxi Normal Univ, Key Lab Educ Blockchain & Intelligent Technol, Minist Educ, Guilin 541004, Peoples R China

Source:

NEURAL COMPUTING & APPLICATIONS | 2024 / Vol. 36 / No. 04

Funding:

National Natural Science Foundation of China;

Keywords:

Lightweight transformer; Multi-head prediction; Channel attention; Image quality assessment; NATURAL SCENE STATISTICS;

DOI:

10.1007/s00521-023-09188-3

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

No-reference (NR) image quality assessment (IQA) is an important task in computer vision. Most NR-IQA methods based on deep neural networks do not reach desirable IQA performance and have bulky models, which makes them difficult to use in practical scenarios. This paper proposes a lightweight transformer and multi-head prediction network for NR-IQA. The proposed method consists of two lightweight modules: feature extraction and multi-head prediction. The feature extraction module uses lightweight transformer blocks to learn features at different scales for measuring different image distortions. The multi-head prediction module uses three weighted prediction blocks and a fully connected (FC) layer to aggregate the learned features and predict the image quality score. Each weighted prediction block measures the importance of different elements of the input feature at the same scale. Because both the importance of feature elements within a scale and the importance of features across scales are considered, the multi-head prediction module can provide more accurate predictions. Extensive experiments are conducted on standard IQA datasets. The results show that the proposed method outperforms some baseline NR-IQA methods in IQA performance on large image datasets. In terms of model complexity, the proposed method is also superior to several recent NR-IQA methods.
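
The aggregation described in the abstract (three weighted prediction blocks followed by an FC layer) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract does not specify the internals of a weighted prediction block, so the per-element score and softmax importance weighting below are assumptions, as are the class names `WeightedPredictionBlock` and `MultiHeadPrediction`, the feature sizes, and the random, untrained parameters.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: `weights` holds one row of coefficients per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def init_linear(rng, n_in, n_out):
    """Random (untrained) parameters, for illustration only."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class WeightedPredictionBlock:
    """Hypothetical block: per-element scores combined by importance weights."""

    def __init__(self, rng, dim):
        self.score_w, self.score_b = init_linear(rng, dim, dim)
        self.weight_w, self.weight_b = init_linear(rng, dim, dim)

    def __call__(self, feature):
        scores = linear(feature, self.score_w, self.score_b)
        # Assumed form of "importance of different elements" at one scale.
        importance = softmax(linear(feature, self.weight_w, self.weight_b))
        return sum(s * a for s, a in zip(scores, importance))


class MultiHeadPrediction:
    """One block per scale, then an FC layer that weighs the scales."""

    def __init__(self, rng, dims):
        self.blocks = [WeightedPredictionBlock(rng, d) for d in dims]
        self.fc_w, self.fc_b = init_linear(rng, len(dims), 1)

    def __call__(self, features):
        per_scale = [block(f) for block, f in zip(self.blocks, features)]
        return linear(per_scale, self.fc_w, self.fc_b)[0]


rng = random.Random(0)
dims = [8, 16, 32]  # assumed feature sizes for the three scales
head = MultiHeadPrediction(rng, dims)
features = [[rng.gauss(0.0, 1.0) for _ in range(d)] for d in dims]
quality = head(features)  # a single scalar quality score
```

In this sketch the softmax inside each block plays the role of element importance within a scale, while the final FC layer plays the role of importance across scales. A real model would learn these parameters from IQA datasets.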


Pages: 1947-1957

Page count: 11

References: 57 in total

