Joint Scence Network and Attention-Guided for Image Captioning

Cited by: 2
Authors
Zhou, Dongming [1]
Yang, Jing [1]
Zhang, Canlong [1]
Tang, Yanping [2]
Affiliations
[1] Guangxi Normal Univ, Guangxi Key Lab Multisource Informat Min & Secur, Guilin 541000, Peoples R China
[2] Guilin Univ Elect Technol, Sch Comp Sci & Informat Secur, Guilin 541004, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Image captioning; Attention Network; Graph Convolutional Network; Machine Learning;
DOI
10.1109/ICDM51629.2021.00201
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Image captioning is an interesting and challenging task. Previous image captioning approaches are based mainly on the encoder-decoder architecture, but they suffer from problems such as inaccurate caption information, and the captions they generate are not sufficiently rich. This paper proposes a novel image captioning model based on a self-attention network and a scene graph relationship network. First, an improved self-attention network is added to the extraction of visual features to evaluate the effectiveness of image global information for image generation. Then, a visual intensity parameter is designed to coordinate the strategies of the visual features and the language model for word generation. Finally, a graph convolutional network is designed to extract relationships from the scene information, to render the generated caption more exciting and to increase the accuracy of fine-grained captioning. The model shows satisfactory performance on the MS-COCO and Flickr 30K datasets, and the experimental results demonstrate that it achieves state-of-the-art performance.
Pages: 1535 - 1540
Number of pages: 6
Related papers
50 in total
  • [31] AMFNet: An attention-guided generative adversarial network for multi-model image fusion
    Wang, Jing
    Yu, Long
    Tian, Shengwei
    Wu, Weidong
    Zhang, Dezhi
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2022, 78
  • [32] An Attention-Guided Multilayer Feature Aggregation Network for Remote Sensing Image Scene Classification
    Li, Ming
    Lei, Lin
    Tang, Yuqi
    Sun, Yuli
    Kuang, Gangyao
    REMOTE SENSING, 2021, 13 (16)
  • [33] Text-Guided Attention Model for Image Captioning
    Mun, Jonghwan
    Cho, Minsu
    Han, Bohyung
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017 : 4233 - 4239
  • [34] Image Inpainting Anti-Forensics Network via Attention-Guided Hierarchical Reconstruction
    Dou, Liyun
    Feng, Guorui
    Qian, Zhenxing
    SYMMETRY-BASEL, 2023, 15 (02)
  • [35] LACN: A lightweight attention-guided ConvNeXt network for low-light image enhancement
    Fan, Saijie
    Liang, Wei
    Ding, Derui
    Yu, Hui
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 117
  • [36] Attention-guided network with hierarchical global priors for low-light image enhancement
    Gong, An
    Li, Zhonghao
    Wang, Heng
    Li, Guangtong
    SIGNAL IMAGE AND VIDEO PROCESSING, 2023, 17 (05) : 2083 - 2091
  • [37] Attention-guided dynamic multi-branch neural network for underwater image enhancement
    Yan, Xiaohong
    Qin, Wenqiang
    Wang, Yafei
    Wang, Guangyuan
    Fu, Xianping
    KNOWLEDGE-BASED SYSTEMS, 2022, 258
  • [38] Two-view attention-guided convolutional neural network for mammographic image classification
    Sun, Lilei
    Wen, Jie
    Wang, Junqian
    Zhao, Yong
    Zhang, Bob
    Wu, Jian
    Xu, Yong
    CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY, 2023, 8 (02) : 453 - 467
  • [39] MRSCAtt: A Spatio-Channel Attention-Guided Network for Mars Rover Image Classification
    Chakravarthy, Anirudh S.
    Roy, Roshan
    Ravirathinam, Praveen
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021 : 1961 - 1970
  • [40] CDAN: Convolutional dense attention-guided network for low-light image enhancement
    Shakibania, Hossein
    Raoufi, Sina
    Khotanlou, Hassan
    DIGITAL SIGNAL PROCESSING, 2025, 156