Lightweight transformer and multi-head prediction network for no-reference image quality assessment

Cited by: 3
Authors
Tang, Zhenjun [1]
Chen, Yihua [1]
Chen, Zhiyuan [1]
Liang, Xiaoping [1]
Zhang, Xianquan [1]
Affiliations
[1] Guangxi Normal Univ, Key Lab Educ Blockchain & Intelligent Technol, Minist Educ, Guilin 541004, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Lightweight transformer; Multi-head prediction; Channel attention; Image quality assessment; Natural scene statistics
DOI
10.1007/s00521-023-09188-3
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
No-reference (NR) image quality assessment (IQA) is an important task in computer vision. Most NR-IQA methods based on deep neural networks fall short of desirable IQA performance and rely on bulky models that are difficult to use in practical scenarios. This paper proposes a lightweight transformer and multi-head prediction network for NR-IQA. The proposed method consists of two lightweight modules: feature extraction and multi-head prediction. The feature extraction module uses lightweight transformer blocks to learn features at different scales for measuring different image distortions. The multi-head prediction module uses three weighted prediction blocks and a fully connected (FC) layer to aggregate the learned features and predict the image quality score. Each weighted prediction block measures the importance of the different elements of its input feature at a single scale. Because both the importance of feature elements within a scale and the importance of features across scales are considered, the multi-head prediction module can provide more accurate predictions. Extensive experiments are conducted on standard IQA datasets. The results show that the proposed method outperforms some baseline NR-IQA methods in IQA performance on the large image datasets. In terms of model complexity, the proposed method is also superior to several recent NR-IQA methods.
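The aggregation idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify how a weighted prediction block computes importance, so the softmax weighting, the scalar parameters `score_w` and `weight_w`, and the function names below are all assumptions made for illustration. Each hypothetical block turns one scale's feature elements into a single score by an importance-weighted sum, and a linear (FC-like) step then combines the three per-scale scores.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_prediction_block(features, score_w, weight_w):
    """Hypothetical weighted prediction block for one scale.

    Assumption: each feature element yields a score (score_w * f) and an
    importance logit (weight_w * f); the block returns the
    importance-weighted sum of the per-element scores.
    """
    scores = [score_w * f for f in features]
    importance = softmax([weight_w * f for f in features])
    return sum(s * w for s, w in zip(scores, importance))


def multi_head_predict(multi_scale_features, block_params, fc_weights, fc_bias):
    """Combine the per-scale scores with a linear (FC-like) layer.

    multi_scale_features: one feature list per scale (three in the abstract).
    block_params: one (score_w, weight_w) pair per scale (toy parameters).
    """
    heads = [
        weighted_prediction_block(feats, score_w, weight_w)
        for feats, (score_w, weight_w) in zip(multi_scale_features, block_params)
    ]
    return sum(w * h for w, h in zip(fc_weights, heads)) + fc_bias


if __name__ == "__main__":
    # Toy features at three scales; all values are made up for illustration.
    feats = [[0.2, 0.9, 0.4], [0.5, 0.1], [0.7, 0.3, 0.6, 0.8]]
    params = [(1.0, 2.0), (1.0, 2.0), (1.0, 2.0)]
    print(multi_head_predict(feats, params, [0.3, 0.3, 0.4], 0.0))
```

In a trained network the per-element scores and importance values would come from learned layers rather than fixed scalars; the sketch only shows how weighting within a scale and weighting across scales compose.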
Pages: 1947-1957
Number of pages: 11