DUAL ATTENTION ENHANCED TRANSFORMER FOR IMAGE DEFOCUS DEBLURRING

Cited by: 0
Authors
He, Yuhang [1 ]
Tian, Senmao [1 ]
Zhang, Jian [1 ]
Zhang, Shunli [1 ]
Affiliations
[1] Beijing Jiaotong Univ, Sch Software Engn, Beijing, Peoples R China
Source
2024 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP | 2024
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation;
Keywords
Image Defocus Deblurring; dual self-attention; transformer; auxiliary enhanced attention;
DOI
10.1109/ICIP51287.2024.10648112
Abstract
Image defocus deblurring remains a challenging problem due to the uncertainty of the blurred region and the varying depth of field. Although convolutional neural networks (CNNs) have achieved promising results on this task, their limited receptive fields and static weights hinder restoration performance. Transformer models, in contrast, can mitigate these weaknesses of CNNs. However, recent Transformer-based models for image defocus deblurring apply self-attention along only the spatial or the channel dimension, neglecting cross-dimensional information essential for restoration. In this paper, we propose a novel Transformer model, the Dual Attention Enhanced Transformer (DAEformer), for image defocus deblurring. DAEformer combines self-attention from both the spatial and channel dimensions while also applying auxiliary enhanced attention modules. We present the Spatial Attention Enhanced Block (SAEB) and the Channel Attention Enhanced Block (CAEB), which not only fuse spatial and channel information within blocks but also enhance details. Furthermore, we design a progressive hierarchical architecture that applies SAEB/CAEB at different levels to model distinct information and facilitate fusion across blocks. Experimental results demonstrate that DAEformer achieves state-of-the-art results on the dual-pixel dataset.
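The spatial/channel distinction the abstract draws can be illustrated in a minimal NumPy sketch. This is not the paper's SAEB/CAEB implementation (those details are not in the record); it only shows the generic idea that spatial self-attention builds an (HW x HW) attention map over pixel tokens, while channel self-attention transposes the tokens and builds a (C x C) map over feature channels. The function names and the flattened (HW, C) token layout are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention(x):
    """x: (HW, C) pixel tokens; attention map shape is (HW, HW)."""
    scale = x.shape[-1] ** -0.5
    attn = softmax((x @ x.T) * scale, axis=-1)
    return attn @ x                      # (HW, C)

def channel_attention(x):
    """Same input, but channels act as tokens; attention map is (C, C)."""
    xt = x.T                             # (C, HW)
    scale = xt.shape[-1] ** -0.5
    attn = softmax((xt @ xt.T) * scale, axis=-1)
    return (attn @ xt).T                 # back to (HW, C)
```

Both operators preserve the (HW, C) feature shape, which is what lets a model interleave or fuse them block by block, as the abstract describes at a high level.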
Pages: 1487 - 1493
Page count: 7