MaCon: A Generic Self-Supervised Framework for Unsupervised Multimodal Change Detection

Cited by: 0
Authors
Wang, Jian [1, 2]
Yan, Li [1, 3]
Yang, Jianbing [1, 3]
Xie, Hong [1, 3]
Yuan, Qiangqiang [1, 3]
Wei, Pengcheng [1, 3]
Gao, Zhao [6]
Zhang, Ce [4, 7]
Atkinson, Peter M. [5, 8, 9]
Affiliations
[1] Wuhan Univ, Sch Geodesy & Geomat, Hubei Luojia Lab, Wuhan 430079, Peoples R China
[2] Univ Lancaster, Fac Sci & Technol, Lancaster LA1 4YQ, England
[3] Univ Lancaster, Lancaster Environm Ctr, Lancaster LA1 4YQ, England
[4] Univ Bristol, Sch Geog Sci, Bristol BS8 1SS, England
[5] Univ Lancaster, Fac Sci & Technol, Lancaster LA1 4YR, England
[6] Wuhan Univ, Sch Comp Sci, Wuhan 430079, Peoples R China
[7] UK Ctr Ecol & Hydrol, Lancaster LA1 4AP, England
[8] Univ Southampton, Sch Geog & Environm Sci, Southampton SO17 1BJ, England
[9] Tongji Univ, Coll Surveying & Geoinformat, Shanghai 200092, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Data mining; Training; Image reconstruction; Feature extraction; Sensors; Electronic mail; Earth; Contrastive learning; Accuracy; Transforms; Self-supervised learning; mask reconstruction; multimodal data; change detection; unsupervised learning; remote sensing; Earth observation; REMOTE-SENSING IMAGES; GRAPH; SAR; REGRESSION; NETWORK
DOI
10.1109/TIP.2025.3542276
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Change detection (CD) is important for Earth observation, emergency response, and time-series understanding. Data availability across modalities has increased rapidly in recent years, and multimodal change detection (MCD) is gaining prominence. Given the scarcity of datasets and labels for MCD, unsupervised approaches are more practical. However, previous methods typically either reduce the gap between multimodal data through transformation alone or feed the original multimodal data directly into a discriminant network for difference extraction. The former struggle to extract precise difference features. The latter must contend with the pronounced intrinsic differences between the original multimodal data: extracting and comparing features directly usually introduces significant noise, which degrades the quality of the resulting difference image. In this article, we propose the MaCon framework to jointly distill common and discrepancy representations. MaCon unifies the mask reconstruction (MR) and contrastive learning (CL) self-supervised paradigms, where MR serves the purpose of transformation while CL focuses on discrimination. Moreover, we present an optimal sampling strategy in the CL architecture, enabling the CL subnetwork to extract more distinguishable discrepancy representations. Furthermore, we develop an effective silent attention mechanism that not only enhances contrast in the output representations but also stabilizes training. Experimental results on both multimodal and monomodal datasets demonstrate that MaCon effectively distills the intrinsic common representations between varied modalities and achieves state-of-the-art performance on both multimodal and monomodal CD. These findings suggest that MaCon has the potential to serve as a unified framework for CD and related fields. Source code will be publicly available once the article is accepted.
Pages: 1485-1500
Page count: 16