Exploring Multi-Level Attention and Semantic Relationship for Remote Sensing Image Captioning

Cited by: 37
Authors
Yuan, Zhenghang [1]
Li, Xuelong [1]
Wang, Qi [1]
Affiliations
[1] Northwestern Polytech Univ, Ctr Opt Imagery Anal & Learning, Sch Comp Sci, Xian 710072, Peoples R China
Source
IEEE ACCESS | 2020 / Vol. 8 / Issue 08
Funding
National Natural Science Foundation of China
Keywords
Remote sensing image; image captioning; deep learning; graph convolutional networks (GCNs); semantic understanding;
DOI
10.1109/ACCESS.2019.2962195
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Remote sensing image captioning, which aims to understand the high-level semantic information and interactions of different ground objects, has emerged as a research topic in recent years. Although image captioning has developed rapidly with convolutional neural networks (CNNs) and recurrent neural networks (RNNs), captioning remote sensing images still suffers from two main limitations. First, the scales of objects in remote sensing images vary dramatically, which makes it difficult to obtain an effective image representation. Second, the visual relationships in remote sensing images are still underused, although they have great potential to improve the final performance. To address these two limitations, this paper proposes a framework for remote sensing image captioning based on multi-level attention and multi-label attribute graph convolution. Specifically, the proposed multi-level attention module adaptively focuses not only on specific spatial features but also on features of specific scales. Moreover, the designed attribute graph convolution module employs the attribute graph to learn more effective attribute features for image captioning. Extensive experiments show that the proposed method achieves superior performance on the UCM-captions, Sydney-captions and RSICD datasets.
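The idea of attending over both spatial positions and feature scales can be illustrated with a minimal sketch. This is not the authors' architecture: the dot-product scoring, the function name `multi_level_attention`, and the toy feature values are all assumptions made for illustration, and a real model would use learned projections and a decoder state as the query.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def weighted_sum(weights, vectors):
    # Convex combination of equally sized vectors.
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]

def multi_level_attention(levels, query):
    """Hypothetical two-stage attention.

    levels: one list of spatial feature vectors per scale
            (all vectors share the dimension of `query`).
    query:  vector standing in for the decoder state (assumption).
    """
    level_contexts = []
    for feats in levels:
        # Stage 1: spatial attention within a single scale.
        spatial_w = softmax([dot(f, query) for f in feats])
        level_contexts.append(weighted_sum(spatial_w, feats))
    # Stage 2: attention across the per-scale context vectors.
    scale_w = softmax([dot(c, query) for c in level_contexts])
    return weighted_sum(scale_w, level_contexts), scale_w

# Toy input: two scales with 2 and 3 spatial positions, feature dimension 2.
levels = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [2.0, 0.0], [0.0, 0.2]],
]
query = [1.0, 0.0]
context, scale_w = multi_level_attention(levels, query)
```

Here `scale_w` holds one weight per scale and sums to 1, so the final `context` is a mixture of per-scale summaries, each of which is itself a mixture over spatial positions.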
Pages: 2608-2620
Number of pages: 13
Related papers
50 in total
  • [1] Recurrent Attention and Semantic Gate for Remote Sensing Image Captioning
    Li, Yunpeng
    Zhang, Xiangrong
    Gu, Jing
    Li, Chen
    Wang, Xin
    Tang, Xu
    Jiao, Licheng
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [2] A Multi-Level Attention Model for Remote Sensing Image Captions
    Li, Yangyang
    Fang, Shuangkang
    Jiao, Licheng
    Liu, Ruijiao
    Shang, Ronghua
    REMOTE SENSING, 2020, 12 (06)
  • [3] Multi-level semantic-aware transformer for image captioning
    Xu, Qin
    Song, Shan
    Wu, Qihang
    Jiang, Bo
    Luo, Bin
    Tang, Jinhui
    NEURAL NETWORKS, 2025, 187
  • [4] Remote Sensing Image Segmentation Method Based on Multi-Level Channel Attention
    Yu Shuai
    Wang Xili
    LASER & OPTOELECTRONICS PROGRESS, 2020, 57 (04)
  • [5] Image Captioning with multi-level similarity-guided semantic matching
    Li, Jiesi
    Xu, Ning
    Nie, Weizhi
    Zhang, Shenyuan
    VISUAL INFORMATICS, 2021, 5 (04): 41 - 48
  • [6] Multi-label semantic feature fusion for remote sensing image captioning
    Wang, Shuang
    Ye, Xiutiao
    Gu, Yu
    Wang, Jihui
    Meng, Yun
    Tian, Jingxian
    Hou, Biao
    Jiao, Licheng
    ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2022, 184 : 1 - 18
  • [7] Multi-Source Interactive Stair Attention for Remote Sensing Image Captioning
    Zhang, Xiangrong
    Li, Yunpeng
    Wang, Xin
    Liu, Feixiang
    Wu, Zhaoji
    Cheng, Xina
    Jiao, Licheng
    REMOTE SENSING, 2023, 15 (03)
  • [8] Exploring region features in remote sensing image captioning
    Zhao, Kai
    Xiong, Wei
    INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION, 2024, 127
  • [9] Image Captioning with Semantic Attention
    You, Quanzeng
    Jin, Hailin
    Wang, Zhaowen
    Fang, Chen
    Luo, Jiebo
    2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 4651 - 4659
  • [10] Feature refinement and rethinking attention for remote sensing image captioning
    Li, Yunpeng
    Tao, Chengjin
    Liu, Meng
    Zhang, Xiangrong
    Wang, Guanchun
    Zhang, Tianyang
    Zhao, Dong
    Wang, Dabao
    SCIENTIFIC REPORTS, 2025, 15 (01)