Global-Local Feature Attention Network with Reranking Strategy for Image Caption Generation

Cited by: 2
Authors
Wu, Jie [1 ]
Xie, Siya [1 ]
Shi, Xinbao [1 ]
Chen, Yaowen [2 ]
Affiliations
[1] Shantou Univ, Coll Engn, 243 Daxue Rd, Shantou, Peoples R China
[2] Shantou Univ, Key Lab Digital Signal & Image Proc Guangdong, 243 Daxue Rd, Shantou, Peoples R China
Source
COMPUTER VISION, PT I | 2017 / Vol. 771
Keywords
Image caption; Global-local feature attention network; Reranking strategy; Candidate captions;
DOI
10.1007/978-981-10-7299-4_13
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, a novel framework named global-local feature attention network with reranking strategy (GLAN-RS) is presented for the image captioning task. Rather than adopting only unitary visual information as in classical models, GLAN-RS explores an attention mechanism to capture local convolutional salient image maps. Furthermore, we adopt a reranking strategy to adjust the priority of the candidate captions and select the best one. The proposed model is verified on the MSCOCO benchmark dataset across seven standard evaluation metrics. Experimental results show that GLAN-RS significantly outperforms state-of-the-art approaches such as M-RNN and Google NIC, with an improvement of 20% in BLEU4 score and 13 points in CIDEr score.
Pages: 157 - 167
Page count: 11
Related papers
50 in total
  • [1] Global-local feature attention network with reranking strategy for image caption generation
    Wu J.
    Xie S.-Y.
    Shi X.-B.
    Chen Y.-W.
    Chen, Yao-wen (ywchen@stu.edu.cn), 1600, Springer Verlag (13): : 448 - 451
  • [2] Global-local feature attention network with reranking strategy for image caption generation
    Wu, Jie
    Xie, Siya
    Shi, Xinbao
    Chen, Yaowen
    Optoelectronics Letters, 2017, 13 (06) : 448 - 451
  • [3] Image Caption with Global-Local Attention
    Li, Linghui
    Tang, Sheng
    Deng, Lixi
    Zhang, Yongdong
    Tian, Qi
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 4133 - 4139
  • [4] Neural Image Caption Generation with Global Feature Based Attention Scheme
    Wang, Yongzhuang
    Xiong, Hongkai
    IMAGE AND GRAPHICS (ICIG 2017), PT II, 2017, 10667 : 51 - 61
  • [5] Local Attribute Attention Network for Minority Clothing Image Caption Generation
    Xuhui Z.
    Li L.
    Xiaodong F.
    Lijun L.
    Wei P.
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2024, 36 (03): : 399 - 412
  • [6] Image captioning based on global-local feature and adaptive-attention
    Zhao X.-H.
    Yin L.-F.
    Zhao C.-L.
    Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science), 2020, 54 (01): : 126 - 134
  • [7] A global-local feature adaptive fusion network for image scene classification
    Lv, Guangrui
    Dong, Lili
    Zhang, Wenwen
    Xu, Wenhai
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (03) : 6521 - 6554
  • [8] A global-local feature adaptive fusion network for image scene classification
    Guangrui Lv
    Lili Dong
    Wenwen Zhang
    Wenhai Xu
    Multimedia Tools and Applications, 2024, 83 : 6521 - 6554
  • [9] GLA: Global-Local Attention for Image Description
    Li, Linghui
    Tang, Sheng
    Zhang, Yongdong
    Deng, Lixi
    Tian, Qi
    IEEE TRANSACTIONS ON MULTIMEDIA, 2018, 20 (03) : 726 - 737
  • [10] Image Caption Generation with Local Semantic and Global Information
    Liu, Xing
    Liu, Weibin
    Xing, Weiwei
    2019 IEEE SMARTWORLD, UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTING, SCALABLE COMPUTING & COMMUNICATIONS, CLOUD & BIG DATA COMPUTING, INTERNET OF PEOPLE AND SMART CITY INNOVATION (SMARTWORLD/SCALCOM/UIC/ATC/CBDCOM/IOP/SCI 2019), 2019, : 680 - 685