Image Caption with Global-Local Attention

Authors
Li, Linghui [1, 2]
Tang, Sheng [1]
Deng, Lixi [1, 2]
Zhang, Yongdong [1]
Tian, Qi [3]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100039, Peoples R China
[3] Univ Texas San Antonio, Dept Comp Sci, San Antonio, TX 78249 USA
Source
THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE | 2017
Funding
Beijing Natural Science Foundation
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
Image captioning is becoming increasingly important in the field of artificial intelligence. Most existing methods based on the CNN-RNN framework suffer from missed objects and mispredictions because they rely solely on a global, image-level representation. To address these problems, we propose a global-local attention (GLA) method that integrates local, object-level representations with the global, image-level representation through an attention mechanism. The proposed method can therefore focus on predicting salient objects more precisely and with high recall while concurrently preserving image-level context. As a result, GLA generates more relevant sentences and achieves state-of-the-art performance on the well-known Microsoft COCO caption dataset across several popular metrics.
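The idea of attending jointly over one image-level feature and several object-level features can be illustrated with a minimal sketch. This is not the authors' formulation: the bilinear scoring function, the matrix `W`, the feature dimensions, and the function name `global_local_attention` are all assumptions made for illustration, and random vectors stand in for CNN and detector outputs.

```python
import numpy as np

def softmax(x):
    # Subtract the maximum for numerical stability before exponentiating.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def global_local_attention(global_feat, local_feats, hidden, W):
    """Hypothetical attention step (illustrative, not the paper's equations).

    global_feat: (d,)   image-level feature
    local_feats: (n, d) object-level features
    hidden:      (k,)   current decoder (RNN) state
    W:           (d, k) assumed bilinear scoring matrix
    """
    # Treat the global feature as one more candidate next to the n objects.
    feats = np.vstack([global_feat[None, :], local_feats])  # (n + 1, d)
    # Relevance of each feature to the decoder state (assumed bilinear form).
    scores = feats @ W @ hidden                             # (n + 1,)
    weights = softmax(scores)
    # Context vector: attention-weighted mix of global and local features.
    context = weights @ feats                               # (d,)
    return context, weights

rng = np.random.default_rng(0)
d, k, n = 8, 4, 3
g = rng.normal(size=d)          # stand-in for an image-level CNN feature
objs = rng.normal(size=(n, d))  # stand-ins for object-level features
h = rng.normal(size=k)          # stand-in for the decoder hidden state
W = rng.normal(size=(d, k))
context, weights = global_local_attention(g, objs, h, W)
print(weights.shape, context.shape)
```

In a captioning decoder, a context vector of this kind would be recomputed at every time step and fed to the RNN together with the previous word, so that the weight on each object can change as the sentence is generated.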
Pages: 4133-4139
Page count: 7