DCAT: Dual Cross-Attention-Based Transformer for Change Detection

Cited by: 12
Authors
Zhou, Yuan [1, 2]
Huo, Chunlei [1, 2, 3]
Zhu, Jiahang [1, 2]
Huo, Leigang [4]
Pan, Chunhong [2]
Affiliations
[1] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 101408, Peoples R China
[2] Chinese Acad Sci, Inst Automation, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
[3] Univ Sci & Technol Beijing, Sch Automation & Elect Engn, Beijing 100083, Peoples R China
[4] Nanning Normal Univ, Sch Comp & Informat Engn, Nanning 530001, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
change detection; transformer; dual cross-attention; remote sensing; NETWORK;
DOI
10.3390/rs15092395
Chinese Library Classification number
X [Environmental Science, Safety Science];
Discipline classification code
08 ; 0830 ;
Abstract
Several transformer-based methods for change detection (CD) in remote sensing images have been proposed, with Siamese-based methods showing promising results due to their two-stream feature extraction structure. However, these methods ignore the potential of the cross-attention mechanism to improve change feature discrimination and may therefore limit final performance. Additionally, using either high-frequency-like fast change or low-frequency-like slow change alone may not effectively represent complex bi-temporal features. To address these limitations, we propose the dual cross-attention transformer (DCAT). The method mimics how humans visually observe change: it lets bi-temporal features interact and then merges them. Unlike traditional Siamese-based CD frameworks, it extracts multi-scale features and models patch-wise change relationships by connecting a series of hierarchically structured dual cross-attention blocks (DCABs). Each DCAB is based on a hybrid dual-branch mixer that combines convolution and transformer components to extract and fuse local and global features. It computes two types of cross-attention features to learn comprehensive cues from both the low- and high-frequency information in paired CD images, which helps enhance discrimination between changed and unchanged regions during feature extraction. The feature pyramid fusion network is more lightweight than the encoder and produces powerful multi-scale change representations by aggregating features from different layers. Experiments on four CD datasets demonstrate the advantages of the DCAT architecture over other state-of-the-art methods.
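The cross-attention idea mentioned in the abstract can be illustrated with a small sketch. This is not the DCAB from the paper: it is generic scaled dot-product cross-attention applied in both temporal directions, with identity projections instead of learned weights, no convolution branch, and no low-/high-frequency split. The function names and the toy token values are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context):
    """Scaled dot-product attention where the queries come from one image
    and the keys/values come from the other image.

    Assumption: identity projections (no learned W_q, W_k, W_v), so the
    keys and values are the context tokens themselves.
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    attended = []
    for q in queries:
        # Similarity of this query patch to every patch of the other image.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in context]
        weights = softmax(scores)
        # Weighted sum of the context tokens.
        attended.append([
            sum(w * v[d] for w, v in zip(weights, context))
            for d in range(dim)
        ])
    return attended


def bidirectional_cross_attention(tokens_t1, tokens_t2):
    """Return (t1 attending to t2, t2 attending to t1)."""
    return (
        cross_attention(tokens_t1, tokens_t2),
        cross_attention(tokens_t2, tokens_t1),
    )


# Toy patch tokens: 3 patches per image, 2 channels each (made-up values).
tokens_t1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
tokens_t2 = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]

t1_from_t2, t2_from_t1 = bidirectional_cross_attention(tokens_t1, tokens_t2)
```

Each output token is a convex combination of the other image's patch tokens, so a patch whose content has no close counterpart at the other date ends up with a representation that differs from its own input. A real model would add learned projections, multiple heads, and the hybrid convolution branch described in the abstract.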
Pages: 30