Semantic-CC: Boosting Remote Sensing Image Change Captioning via Foundational Knowledge and Semantic Guidance

Cited by: 0
Authors
Zhu, Yongshuo [1]
Li, Lu [1]
Chen, Keyan [1, 2]
Liu, Chenyang [1, 2]
Zhou, Fugen [1]
Shi, Zhenwei [1, 2]
Affiliations
[1] Beihang Univ, Image Proc Ctr, Sch Astronaut, Beijing 100191, Peoples R China
[2] Beihang Univ, State Key Lab Virtual Real Technol & Syst, Beijing 100191, Peoples R China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2024 / Vol. 62
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation;
Keywords
Training; Accuracy; Semantics; Natural languages; Feature extraction; Stability analysis; Decoding; Neck; Sensors; Remote sensing; Change captioning (CC); foundation model; multitask learning (MTL); remote sensing image;
DOI
10.1109/TGRS.2024.3497338
Chinese Library Classification (CLC) number
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification code
0708; 070902;
Abstract
Remote sensing image change captioning (RSICC) aims to articulate the changes in objects of interest within bitemporal remote sensing images using natural language. Given the limitations of current RSICC methods in expressing general features across multitemporal and spatial scenarios, and their deficiency in providing granular, robust, and precise change descriptions, we introduce a novel change captioning (CC) method based on the foundational knowledge and semantic guidance, which we term Semantic-CC. Semantic-CC alleviates the dependency of high-generalization algorithms on extensive annotations by harnessing the latent knowledge of foundation models, and it generates more comprehensive and accurate change descriptions guided by pixel-level semantics from change detection (CD). Specifically, we propose a bitemporal SAM-based encoder for dual-image feature extraction; a multitask semantic aggregation neck for facilitating information interaction between heterogeneous tasks; a straightforward multiscale CD decoder to provide pixel-level semantic guidance; and a change caption decoder based on the large language model (LLM) to generate change description sentences. Moreover, to ensure the stability of the joint training of CD and CC, we propose a three-stage training strategy that supervises different tasks at various stages. We validate the proposed method on the LEVIR-CC and LEVIR-CD datasets. The experimental results corroborate the complementarity of CD and CC, demonstrating that Semantic-CC can generate more accurate change descriptions and achieve optimal performance across both tasks.
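The data flow named in the abstract (a shared bitemporal encoder, an aggregation neck, a CD decoder, a caption decoder, and stage-wise supervision) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the function names, the absolute-difference fusion, the thresholded mask, the template caption, and in particular the assignment of tasks to the three stages are hypothetical and are not taken from the paper, which uses a SAM-based encoder and an LLM-based decoder. Passing the mask directly to the caption step is also a simplification of the semantic guidance the abstract describes.

```python
# Toy, dependency-free sketch of a joint change-detection (CD) and
# change-captioning (CC) pipeline. All names and numbers are hypothetical.

# Hypothetical stage schedule: which task losses are active in each stage.
# The abstract only says different tasks are supervised at different stages;
# this specific assignment is an assumption, not the authors' schedule.
STAGES = {
    1: {"cd": 1.0, "cc": 0.0},
    2: {"cd": 0.0, "cc": 1.0},
    3: {"cd": 1.0, "cc": 1.0},
}


def encode(image):
    """Stand-in for the shared encoder: one float feature per pixel."""
    return [float(p) for p in image]


def neck(feat_t1, feat_t2):
    """Stand-in for the aggregation neck: absolute feature difference."""
    return [abs(a - b) for a, b in zip(feat_t1, feat_t2)]


def cd_head(fused, threshold=0.5):
    """Toy CD decoder: binary change mask, 1 where the difference is large."""
    return [1 if f > threshold else 0 for f in fused]


def cc_head(mask):
    """Toy caption decoder: a template sentence derived from the mask."""
    changed = sum(mask)
    if changed == 0:
        return "no change"
    return f"{changed} of {len(mask)} pixels changed"


def forward(img_t1, img_t2):
    """Run both images through the same encoder, then both task heads."""
    fused = neck(encode(img_t1), encode(img_t2))
    mask = cd_head(fused)
    return mask, cc_head(mask)


def total_loss(stage, cd_loss, cc_loss):
    """Weighted sum of the two task losses for the given training stage."""
    weights = STAGES[stage]
    return weights["cd"] * cd_loss + weights["cc"] * cc_loss
```

For example, `forward([0, 0, 1, 1], [0, 1, 1, 0])` yields a mask and a caption from the same fused features, and `total_loss(1, cd_loss, cc_loss)` ignores the caption loss under the assumed schedule.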
Pages: 16