PROGRESSIVE SCALE-AWARE NETWORK FOR REMOTE SENSING IMAGE CHANGE CAPTIONING

Cited by: 20
Authors
Liu, Chenyang [1 ,3 ]
Yang, Jiajun [1 ,3 ]
Qi, Zipeng [1 ,3 ]
Zou, Zhengxia [2 ,3 ]
Shi, Zhenwei [1 ,3 ]
Affiliations
[1] Beihang Univ, Image Proc Ctr, Sch Astronaut, Beijing 100191, Peoples R China
[2] Beihang Univ, Dept Guidance Nav & Control, Sch Astronaut, Beijing 100191, Peoples R China
[3] Shanghai Artificial Intelligence Lab, Shanghai 200232, Peoples R China
Source
IGARSS 2023 - 2023 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM | 2023
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation;
Keywords
Remote sensing image; change captioning; Transformer; scale-aware reinforcement;
DOI
10.1109/IGARSS52108.2023.10283451
Chinese Library Classification (CLC) number
P [Astronomy, Earth Sciences];
Discipline classification code
07;
Abstract
Remote sensing (RS) images contain numerous objects of different scales, which poses significant challenges for the RS image change captioning (RSICC) task to identify visual changes of interest in complex scenes and describe them via language. However, current methods still have some weaknesses in sufficiently extracting and utilizing multi-scale information. In this paper, we propose a progressive scale-aware network (PSNet) to address the problem. PSNet is a pure Transformer-based model. To sufficiently extract multi-scale visual features, multiple progressive difference perception (PDP) layers are stacked to progressively exploit the differencing features of bitemporal features. To sufficiently utilize the extracted multi-scale features for captioning, we propose a scale-aware reinforcement (SR) module and combine it with the Transformer decoding layer to progressively utilize the features from different PDP layers. Experiments show that the PDP layer and SR module are effective and our PSNet outperforms previous methods.
Pages: 6668-6671
Page count: 4
Related papers
13 in total
[1]  
Ba J.L., 2016, Layer Normalization
[2]   Unequal adaptive visual recognition by learning from multi-modal data [J].
Cai, Ziyun ;
Zhang, Tengfei ;
Jing, Xiao-Yuan ;
Shao, Ling .
INFORMATION SCIENCES, 2022, 600 :1-21
[3]   CAPTIONING CHANGES IN BI-TEMPORAL REMOTE SENSING IMAGES [J].
Chouaf, Seloua ;
Hoxha, Genc ;
Smara, Youcef ;
Melgani, Farid .
2021 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM IGARSS, 2021, :2891-2894
[4]  
Dosovitskiy A., 2020, ICLR 2021
[5]  
Hoxha G., 2022, IEEE T GEOSCI REMOTE, P1, DOI 10.1109/M2GARSS52314.2022.9840136
[6]   A Novel Pixel Orientation Estimation Based Line Segment Detection Framework, and Its Applications to SAR Images [J].
Liu, Chenguang ;
Liu, Cuiling ;
Wang, Chisheng ;
Zhu, Wu ;
Li, Qingquan .
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
[7]   Extending On-Chain Trust to Off-Chain - Trustworthy Blockchain Data Collection Using Trusted Execution Environment (TEE) [J].
Liu, Chunchi ;
Guo, Hechuan ;
Xu, Minghui ;
Wang, Shengling ;
Yu, Dongxiao ;
Yu, Jiguo ;
Cheng, Xiuzhen .
IEEE TRANSACTIONS ON COMPUTERS, 2022, 71 (12) :3268-3280
[8]   Robust Change Captioning [J].
Park, Dong Huk ;
Darrell, Trevor ;
Rohrbach, Anna .
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, :4623-4632
[9]   Describing and Localizing Multiple Changes with Transformers [J].
Qiu, Yue ;
Yamamoto, Shintaro ;
Nakashima, Kodai ;
Suzuki, Ryota ;
Iwata, Kenji ;
Kataoka, Hirokatsu ;
Satoh, Yutaka .
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, :1951-1960
[10]   Change Detection Based on Artificial Intelligence: State-of-the-Art and Challenges [J].
Shi, Wenzhong ;
Zhang, Min ;
Zhang, Rui ;
Chen, Shanxiong ;
Zhan, Zhao .
REMOTE SENSING, 2020, 12 (10)