Image Dehazing Transformer with Transmission-Aware 3D Position Embedding

Cited by: 308
Authors
Guo, Chunle [1]
Yan, Qixin [2]
Anwar, Saeed [3]
Cong, Runmin [4]
Ren, Wenqi [5]
Li, Chongyi [6]
Affiliations
[1] Nankai Univ, TMCC, CS, Tianjin, Peoples R China
[2] Tianjin Univ, Tianjin, Peoples R China
[3] Australian Natl Univ, Canberra, ACT, Australia
[4] Beijing Jiaotong Univ, Beijing, Peoples R China
[5] Sun Yat Sen Univ, Guangzhou, Peoples R China
[6] Nanyang Technol Univ, S Lab, Singapore, Singapore
Source
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) | 2022
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
DOI
10.1109/CVPR52688.2022.00572
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Although single image dehazing has made promising progress with Convolutional Neural Networks (CNNs), the inherent equivariance and locality of convolution still bottleneck dehazing performance. Though Transformers have been applied to a wide range of computer vision tasks, directly leveraging a Transformer for image dehazing is challenging: 1) it tends to produce ambiguous and coarse details that are undesirable for image reconstruction; 2) the position embedding of previous Transformers is provided in logical or spatial position order and neglects the varying haze densities, which results in sub-optimal dehazing performance. The key insight of this study is to investigate how to combine CNN and Transformer for image dehazing. To solve the feature inconsistency issue between Transformer and CNN, we propose to modulate CNN features by learning modulation matrices (i.e., a coefficient matrix and a bias matrix) conditioned on Transformer features, instead of simple feature addition or concatenation. The feature modulation naturally inherits the global context modeling capability of the Transformer and the local representation capability of the CNN. We bring a haze density-related prior into the Transformer via a novel transmission-aware 3D position embedding module, which not only provides the relative position but also suggests the haze density of different spatial regions. Extensive experiments demonstrate that our method, DeHamer, attains state-of-the-art performance on several image dehazing benchmarks.
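
The feature modulation described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the coefficient and bias matrices are obtained from 1x1 linear projections of the Transformer features, and the function name `modulate`, the weights `w_gamma` and `w_beta`, and all shapes are hypothetical choices made for illustration only.

```python
import numpy as np


def modulate(cnn_feat, trans_feat, w_gamma, w_beta):
    """Modulate CNN features with a coefficient and a bias matrix
    predicted from Transformer features (illustrative sketch only).

    cnn_feat:   (C, H, W) CNN feature map
    trans_feat: (D, H, W) Transformer feature map at the same resolution
    w_gamma:    (C, D) hypothetical 1x1 projection for the coefficient matrix
    w_beta:     (C, D) hypothetical 1x1 projection for the bias matrix
    """
    # A 1x1 projection is a per-pixel linear map over the channel axis.
    gamma = np.einsum("cd,dhw->chw", w_gamma, trans_feat)  # coefficient matrix
    beta = np.einsum("cd,dhw->chw", w_beta, trans_feat)    # bias matrix
    # Element-wise affine modulation instead of addition or concatenation.
    return gamma * cnn_feat + beta


rng = np.random.default_rng(0)
cnn_feat = rng.standard_normal((4, 8, 8))    # assumed C=4, H=W=8
trans_feat = rng.standard_normal((6, 8, 8))  # assumed D=6
w_gamma = rng.standard_normal((4, 6))
w_beta = rng.standard_normal((4, 6))

out = modulate(cnn_feat, trans_feat, w_gamma, w_beta)
print(out.shape)
```

Unlike plain addition, the coefficient matrix lets the Transformer features rescale each CNN activation per position, while the bias matrix shifts it; setting the coefficient projection to zero reduces the output to the bias term alone.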
Pages: 5802-5810
Page count: 9