Semantic-CC: Boosting Remote Sensing Image Change Captioning via Foundational Knowledge and Semantic Guidance

Authors
Zhu, Yongshuo [1]
Li, Lu [1]
Chen, Keyan [1, 2]
Liu, Chenyang [1, 2]
Zhou, Fugen [1]
Shi, Zhenwei [1, 2]
Affiliations
[1] Beihang Univ, Image Proc Ctr, Sch Astronaut, Beijing 100191, Peoples R China
[2] Beihang Univ, State Key Lab Virtual Real Technol & Syst, Beijing 100191, Peoples R China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2024 / Vol. 62
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation
Keywords
Training; Accuracy; Semantics; Natural languages; Feature extraction; Stability analysis; Decoding; Neck; Sensors; Remote sensing; Change captioning (CC); foundation model; multitask learning (MTL); remote sensing image;
DOI
10.1109/TGRS.2024.3497338
Chinese Library Classification (CLC)
P3 [Geophysics]; P59 [Geochemistry]
Discipline classification codes
0708; 070902
Abstract
Remote sensing image change captioning (RSICC) aims to articulate the changes in objects of interest within bitemporal remote sensing images using natural language. Given the limitations of current RSICC methods in expressing general features across multitemporal and spatial scenarios, and their deficiency in providing granular, robust, and precise change descriptions, we introduce a novel change captioning (CC) method based on the foundational knowledge and semantic guidance, which we term Semantic-CC. Semantic-CC alleviates the dependency of high-generalization algorithms on extensive annotations by harnessing the latent knowledge of foundation models, and it generates more comprehensive and accurate change descriptions guided by pixel-level semantics from change detection (CD). Specifically, we propose a bitemporal SAM-based encoder for dual-image feature extraction; a multitask semantic aggregation neck for facilitating information interaction between heterogeneous tasks; a straightforward multiscale CD decoder to provide pixel-level semantic guidance; and a change caption decoder based on the large language model (LLM) to generate change description sentences. Moreover, to ensure the stability of the joint training of CD and CC, we propose a three-stage training strategy that supervises different tasks at various stages. We validate the proposed method on the LEVIR-CC and LEVIR-CD datasets. The experimental results corroborate the complementarity of CD and CC, demonstrating that Semantic-CC can generate more accurate change descriptions and achieve optimal performance across both tasks.
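The three-stage training strategy mentioned in the abstract could be sketched as a per-stage weighting of the CD and CC losses. The abstract does not say which task is supervised in which stage, so the assignment below (CD only, then CC only, then both), the stage names, the loss weights, and the `epochs_per_stage` parameter are all illustrative assumptions, not the authors' configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageConfig:
    """Loss weights for one training stage (hypothetical values)."""
    name: str
    cd_weight: float  # weight on the change detection (CD) loss
    cc_weight: float  # weight on the change captioning (CC) loss


# ASSUMPTION: the task-to-stage assignment is invented for illustration;
# the abstract only states that different tasks are supervised at
# different stages.
STAGES = (
    StageConfig("stage1_cd_only", cd_weight=1.0, cc_weight=0.0),
    StageConfig("stage2_cc_only", cd_weight=0.0, cc_weight=1.0),
    StageConfig("stage3_joint", cd_weight=1.0, cc_weight=1.0),
)


def stage_for_epoch(epoch: int, epochs_per_stage: int = 10) -> StageConfig:
    """Map a zero-based epoch index to its stage; the last stage absorbs the rest."""
    index = min(epoch // epochs_per_stage, len(STAGES) - 1)
    return STAGES[index]


def total_loss(cd_loss: float, cc_loss: float, stage: StageConfig) -> float:
    """Combine the two task losses using the weights of the active stage."""
    return stage.cd_weight * cd_loss + stage.cc_weight * cc_loss


if __name__ == "__main__":
    for epoch in (0, 15, 40):
        stage = stage_for_epoch(epoch)
        print(epoch, stage.name, total_loss(2.0, 3.0, stage))
```

Setting a weight to zero removes that task's supervision for the stage, which is one simple way to stage the joint training of heterogeneous tasks; the actual method may instead freeze modules or use different schedules.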
Pages: 16