Bilinear Pooling of Transformer Embeddings for Blind Image Quality Assessment

Authors
Feng, Yeli [1]
Affiliations
[1] Nanyang Technol Univ, Singapore, Singapore
Keywords
Vision transformer; Bilinear pooling; Blind image quality assessment; Authentic distortions
DOI
10.1007/978-981-97-3559-4_11
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Blind image quality assessment is used in real-world applications where image distortions are more complex than computer-generated synthetic distortions and high-quality reference images are not available. Over the past decade, research in blind quality prediction has advanced tremendously thanks to the success of convolutional neural networks. However, it remains far from human-like performance and is still a challenging research problem. For the first time, this paper investigates the potential of the ImageNet pre-trained Vision Transformer, a new-generation architecture for image understanding, to provide better quality-aware features. It proposes BPTIQ, a method that leverages multi-level transformer embeddings with bilinear feature pooling and non-monotonic error regularization for blind quality assessment of authentic distortions. The effectiveness of the proposed method was evaluated on four IQA databases with authentic distortions. Experimental results and ablation studies show that BPTIQ is competitive with the nine state-of-the-art IQA methods compared, which mainly use pre-trained convolutional neural networks for feature extraction. BPTIQ performed best on two of the four individual databases and demonstrated more robust cross-database generalization.
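The abstract names bilinear feature pooling but does not give its formulation. As a rough illustration only, the sketch below shows the generic bilinear pooling operation (averaged outer product of two feature sets, followed by a signed square root and L2 normalization) applied to two lists of token embeddings, such as might come from two transformer levels. The function name `bilinear_pool`, the normalization steps, and the toy inputs are assumptions made for illustration; they are not taken from the paper and may differ from what BPTIQ actually does.

```python
import math


def bilinear_pool(x, y):
    """Generic bilinear pooling of two token-embedding sets (illustrative only).

    x: list of N vectors of dimension d1 (e.g. tokens from one level)
    y: list of N vectors of dimension d2 (e.g. tokens from another level)
    Returns a flat list of length d1 * d2.
    """
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and have the same number of tokens")
    d1, d2 = len(x[0]), len(y[0])

    # Average the outer product x_t * y_t^T over all tokens t.
    pooled = [0.0] * (d1 * d2)
    for xt, yt in zip(x, y):
        for i in range(d1):
            for j in range(d2):
                pooled[i * d2 + j] += xt[i] * yt[j] / n

    # Signed square root, a common normalization for bilinear features.
    pooled = [math.copysign(math.sqrt(abs(v)), v) for v in pooled]

    # L2-normalize the resulting vector (skipped if it is all zeros).
    norm = math.sqrt(sum(v * v for v in pooled))
    return [v / norm for v in pooled] if norm > 0 else pooled


# Toy example: 2 tokens, dimensions d1 = 2 and d2 = 1 (hypothetical values).
features = bilinear_pool([[1.0, 0.0], [0.0, 1.0]], [[2.0], [0.0]])
```

In a quality predictor of this kind, a pooled vector like `features` would typically be passed to a small regression head that outputs the quality score; that head is not shown here.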
Pages: 137-150
Number of pages: 14