Automatic image captioning system based on augmentation and ranking mechanism

Cited by: 0
Authors
B. S. Revathi
A. Meena Kowshalya
Affiliations
[1] Government College of Technology,
Source
Keywords
Image captioning; Deep neural network; Encoder-decoder architecture; Ranking LSTM;
DOI
Not available
Chinese Library Classification (CLC) number
Subject classification code
Abstract
Automatically producing syntactically and semantically accurate image captions remains an open challenge. This paper proposes a pretrained Augmentation–Ranking (A–R) image captioning model that enhances image properties and produces appropriate captions. A novel augmentation strategy improves the operation of the convolutional neural network (CNN), while ranking and feedback propagation improve the Long Short-Term Memory (LSTM) network. The model seeks to address the issues of complexity, vanishing gradients and context during training. The augmented CNN expands the training image dataset, and the Ranking LSTM uses ranks to help identify semantically appropriate captions. Combining the two improves the overall image captioning system. The A–R model is examined under maximum and average pooling, using greedy and beam search for decoding. The results are compared with state-of-the-art models such as the bidirectional recurrent neural network, Google NIC and Bi-LSTM combined with a semantic attention mechanism. The model is evaluated on the Flickr8k and Flickr30k datasets using BLEU, METEOR and CIDEr. According to the experimental results, the proposed model, with reduced complexity, generates captions judged accurate and syntactically and semantically correct, achieving an accuracy of 74.87%, above all baseline models.
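The abstract names greedy search and beam search as the two decoding strategies. The sketch below only illustrates how the two can differ. It is not the authors' implementation: `TOY_MODEL`, `next_word_probs`, `greedy_search` and `beam_search` are hypothetical names, and the hand-written probability table stands in for a trained CNN-LSTM decoder.

```python
import math

# Hypothetical toy "decoder": maps a caption prefix to next-word probabilities.
# A real captioning model would also condition on CNN image features.
TOY_MODEL = {
    (): {"a": 0.6, "the": 0.4},
    ("a",): {"dog": 0.5, "cat": 0.5},
    ("the",): {"dog": 0.9, "cat": 0.1},
    ("a", "dog"): {"<end>": 1.0},
    ("a", "cat"): {"<end>": 1.0},
    ("the", "dog"): {"<end>": 1.0},
    ("the", "cat"): {"<end>": 1.0},
}


def next_word_probs(prefix):
    """Return the next-word distribution for a caption prefix (assumed API)."""
    return TOY_MODEL[tuple(prefix)]


def greedy_search(max_len=5):
    """Pick the single most probable word at every step."""
    caption = []
    for _ in range(max_len):
        probs = next_word_probs(caption)
        word = max(probs, key=probs.get)
        if word == "<end>":
            break
        caption.append(word)
    return caption


def beam_search(beam_width=2, max_len=5):
    """Keep the beam_width best partial captions, scored by log-probability."""
    beams = [(0.0, [])]  # (cumulative log-probability, words so far)
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, words in beams:
            for word, p in next_word_probs(words).items():
                new_score = score + math.log(p)
                if word == "<end>":
                    finished.append((new_score, words))
                else:
                    candidates.append((new_score, words + [word]))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
    # Prefer completed captions; fall back to the open beams if none finished.
    best = max(finished or beams, key=lambda c: c[0])
    return best[1]


if __name__ == "__main__":
    print("greedy:", " ".join(greedy_search()))
    print("beam:  ", " ".join(beam_search(beam_width=2)))
```

In this toy table, greedy search commits to "a" at the first step because it is locally most probable, while a beam of width 2 keeps "the" alive and ends with the caption that has the higher overall probability. With `beam_width=1` the beam search reduces to the greedy result.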
Pages: 265-274
Page count: 9