DUAL ATTENTION ENHANCED TRANSFORMER FOR IMAGE DEFOCUS DEBLURRING

Cited by: 0
Authors
He, Yuhang [1 ]
Tian, Senmao [1 ]
Zhang, Jian [1 ]
Zhang, Shunli [1 ]
Affiliations
[1] Beijing Jiaotong Univ, Sch Software Engn, Beijing, Peoples R China
Source
2024 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP | 2024
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation;
Keywords
Image Defocus Deblurring; dual self-attention; transformer; auxiliary enhanced attention;
DOI
10.1109/ICIP51287.2024.10648112
Abstract
Image defocus deblurring remains a challenging problem due to the uncertainty of the blurred region and the varying depth of field. Although convolutional neural networks (CNNs) have achieved promising results on this task, their limited receptive fields and static weights hinder restoration performance. Transformer models, in contrast, can mitigate these weaknesses of CNNs. However, recent Transformer-based models for image defocus deblurring apply self-attention along only the spatial or the channel dimension, neglecting cross-dimensional information essential for restoration. In this paper, we propose a novel Transformer model, the Dual Attention Enhanced Transformer (DAEformer), for image defocus deblurring. DAEformer combines self-attention from both the spatial and channel dimensions while also applying auxiliary enhanced attention modules. We present the Spatial Attention Enhanced Block (SAEB) and the Channel Attention Enhanced Block (CAEB), which not only fuse spatial and channel information within blocks but also enhance details. Furthermore, we design a progressive hierarchical architecture that applies SAEB/CAEB at different levels to model distinct information and facilitate fusion across blocks. Experimental results demonstrate that DAEformer achieves state-of-the-art results on the dual-pixel dataset.
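The spatial/channel distinction the abstract draws can be illustrated in a minimal NumPy sketch. This is not the paper's SAEB/CAEB implementation (those details are not in the record); it only shows the generic idea that spatial self-attention builds an (HW x HW) attention map over pixel tokens, while channel self-attention transposes the tokens and builds a (C x C) map over feature channels. The function names and the flattened (HW, C) token layout are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention(x):
    """x: (HW, C) pixel tokens; attention map shape is (HW, HW)."""
    scale = x.shape[-1] ** -0.5
    attn = softmax((x @ x.T) * scale, axis=-1)
    return attn @ x                      # (HW, C)

def channel_attention(x):
    """Same input, but channels act as tokens; attention map is (C, C)."""
    xt = x.T                             # (C, HW)
    scale = xt.shape[-1] ** -0.5
    attn = softmax((xt @ xt.T) * scale, axis=-1)
    return (attn @ xt).T                 # back to (HW, C)
```

Both operators preserve the (HW, C) feature shape, which is what lets a model interleave or fuse them block by block, as the abstract describes at a high level.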
Pages: 1487 - 1493
Page count: 7