Research on image text generation based on word2vec visual vocabulary attention

Cited by: 0
Authors
Li, Danyang [1]
Zhao, Yahui [1]
Cui, Rongyi [1]
Zhao, Linlin [1]
Affiliation
[1] Yanbian University, Intelligent Information Processing Laboratory, Department of Computer Science and Technology, Yanji, Jilin, People's Republic of China
Source
2021 ASIA-PACIFIC CONFERENCE ON COMMUNICATIONS TECHNOLOGY AND COMPUTER SCIENCE (ACCTCS 2021) | 2021
Keywords
word2vec; Image2text; Image captions; Attention
DOI
10.1109/ACCTCS52002.2021.00075
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes an image text generation method that combines word2vec-based keyword extraction with an attention mechanism. First, for each image in the dataset, the words that co-occur with visual entities in its description set are extracted as keywords. Next, similarity is calculated for the extracted keywords, similar words are selected to expand the keyword list, and only the words present in the vocabulary are retained to create new descriptions for the images. Finally, the test set images are combined with the attention mechanism to generate description text. Experiments show that the proposed method can annotate images automatically and can effectively address the attention diffusion problem in image text generation.
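The keyword expansion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cosine` and `expand_keywords`, the similarity threshold of 0.8, and the three-dimensional toy embeddings are all assumptions made for illustration. In practice the vectors would presumably come from a trained word2vec model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def expand_keywords(keywords, embeddings, vocabulary, threshold=0.8):
    """Expand a keyword list with similar words, then keep in-vocabulary words.

    keywords   -- words extracted for one image (hypothetical input)
    embeddings -- dict mapping word -> vector (stand-in for word2vec vectors)
    vocabulary -- set of words the caption model may emit
    threshold  -- assumed similarity cut-off, not taken from the paper
    """
    expanded = list(keywords)
    for keyword in keywords:
        if keyword not in embeddings:
            continue
        for word, vector in embeddings.items():
            if word in expanded:
                continue
            if cosine(embeddings[keyword], vector) >= threshold:
                expanded.append(word)
    # Retain only words that the vocabulary contains.
    return [word for word in expanded if word in vocabulary]

# Toy data, invented for this example only.
TOY_EMBEDDINGS = {
    "dog": [1.0, 0.0, 0.0],
    "puppy": [0.9, 0.1, 0.0],
    "grass": [0.0, 1.0, 0.0],
    "lawn": [0.1, 0.95, 0.0],
    "car": [0.0, 0.0, 1.0],
    "hound": [0.95, 0.0, 0.1],
}
TOY_VOCABULARY = {"dog", "puppy", "grass", "lawn", "car"}

if __name__ == "__main__":
    print(expand_keywords(["dog", "grass"], TOY_EMBEDDINGS, TOY_VOCABULARY))
```

In this toy setup, "hound" is similar to "dog" but is dropped by the final filter because it is not in the vocabulary, which mirrors the abstract's step of retaining only in-vocabulary words.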
Pages: 344-348
Page count: 5