Cross-Scale KNN Image Transformer for Image Restoration

Cited by: 3
Authors
Lee, Hunsang [1]
Choi, Hyesong [2]
Sohn, Kwanghoon [1]
Min, Dongbo [2]
Affiliations
[1] Yonsei Univ, Sch Elect & Elect Engn, Seoul, South Korea
[2] Ewha Womans Univ, Dept Comp Sci & Engn, Seoul, South Korea
Funding
National Research Foundation, Singapore
Keywords
Image restoration; Transformers; Noise reduction; Complexity theory; Computer vision; Convolutional neural networks; Feature extraction; denoising; deblurring; deraining; transformer; self-attention; k-nn search; low-level vision; ALGORITHMS; NETWORK;
DOI
10.1109/ACCESS.2023.3242556
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Numerous image restoration approaches based on attention mechanisms have been proposed, achieving superior performance to their convolutional neural network (CNN) based counterparts. However, they do not leverage the attention model in a form fully suited to image restoration tasks. In this paper, we propose an image restoration network with a novel attention mechanism, called the cross-scale $k$-NN image Transformer (CS-KiT), that effectively considers several factors essential to image restoration, such as locality, non-locality, and cross-scale aggregation. To achieve locality and non-locality, the CS-KiT builds a $k$-nearest neighbor relation among local patches and aggregates similar patches through local attention. To induce cross-scale aggregation, we ensure that each local patch embraces information from different scales through scale-aware patch embedding (SPE), which predicts the scale of an input patch through a combination of multi-scale convolution branches. We show the effectiveness of the CS-KiT with experimental results, in which it outperforms state-of-the-art restoration approaches on image denoising, deblurring, and deraining benchmarks.
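The idea of building a k-nearest neighbor relation among local patches and aggregating similar patches through attention can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (extract_patches, knn_indices, knn_attention) are hypothetical, the patches are non-overlapping, the query/key/value projections are identity maps rather than learned weights, attention is computed between whole patches, and the scale-aware patch embedding (SPE) is not modeled at all.

```python
import numpy as np


def extract_patches(image, patch_size):
    """Split an (H, W) image into non-overlapping, flattened patches."""
    h, w = image.shape
    p = patch_size
    assert h % p == 0 and w % p == 0, "image size must be divisible by patch size"
    # (H/p, p, W/p, p) -> (H/p, W/p, p, p) -> (num_patches, p*p)
    patches = image.reshape(h // p, p, w // p, p).transpose(0, 2, 1, 3)
    return patches.reshape(-1, p * p)


def knn_indices(patches, k):
    """For every patch, return the indices of its k closest patches (Euclidean)."""
    sq_norms = np.sum(patches ** 2, axis=1)
    # Squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * patches @ patches.T
    return np.argsort(dists, axis=1)[:, :k]


def knn_attention(patches, k):
    """Aggregate each patch from its k nearest patches with softmax attention.

    Assumption: identity query/key/value projections (no learned parameters).
    """
    n, d = patches.shape
    idx = knn_indices(patches, k)                    # (n, k)
    neighbours = patches[idx]                        # (n, k, d)
    scores = np.einsum("nd,nkd->nk", patches, neighbours) / np.sqrt(d)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    out = np.einsum("nk,nkd->nd", weights, neighbours)    # (n, d)
    return out, weights, idx


rng = np.random.default_rng(0)
image = rng.standard_normal((16, 16))
patches = extract_patches(image, patch_size=4)       # 16 patches, 16 values each
out, weights, idx = knn_attention(patches, k=4)
print(out.shape, idx.shape)
```

Restricting attention to the k most similar patches keeps each aggregation small (k entries per patch instead of all patches) while still allowing spatially distant but similar patches to contribute, which is the locality and non-locality trade-off the abstract describes.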
Pages: 13013-13027
Number of pages: 15