Automatic image captioning system based on augmentation and ranking mechanism

Authors
B. S. Revathi
A. Meena Kowshalya
Affiliation
[1] Government College of Technology,
Source
Signal, Image and Video Processing, 2024, 18 (01)
Keywords
Image captioning; Deep neural network; Encoder-decoder architecture; Ranking LSTM;
Abstract
Automatically producing syntactically and semantically accurate image captions remains an open research challenge. This paper proposes an effective pretrained Augmentation–Ranking (A–R) image captioning model. The proposed model improves the properties of the images and produces appropriate captions. A novel augmentation strategy improves the convolutional neural network (CNN), while ranking and feedback propagation improve the Long Short-Term Memory (LSTM) network. The model seeks to address the issues of complexity, vanishing gradients and context during training, and it improves the performance of both the LSTM and the CNN. The augmented CNN is used to expand the training image dataset. The Ranking LSTM uses ranks to help identify semantically correct captions. This combination improves the image captioning system. The proposed A–R model is evaluated under max and average pooling, using greedy and beam search decoding. The results are compared with state-of-the-art models such as the bidirectional recurrent neural network, Google NIC, and Bi-LSTM combined with a semantic attention mechanism. The model is evaluated on the Flickr8k and Flickr30k datasets using metrics including BLEU, METEOR and CIDEr. According to the experimental results, the proposed model, with reduced complexity, generated captions that are accurate and syntactically and semantically correct, achieving an accuracy of 74.87% and outperforming all baseline models.
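The abstract mentions both greedy and beam search decoding. As a minimal sketch of how the two can differ, and not the authors' implementation, the example below uses a hypothetical hand-written table (`TOY_MODEL`) in place of a trained decoder. All names and probabilities are illustrative assumptions; a real captioning model would condition the next-word distribution on CNN image features.

```python
import math

# Hypothetical toy "decoder": maps a caption prefix to next-token probabilities.
# These numbers are made up for illustration only.
TOY_MODEL = {
    (): {"a": 0.6, "the": 0.4},
    ("a",): {"dog": 0.5, "cat": 0.5},
    ("the",): {"dog": 0.9, "cat": 0.1},
    ("a", "dog"): {"<end>": 1.0},
    ("a", "cat"): {"<end>": 1.0},
    ("the", "dog"): {"<end>": 1.0},
    ("the", "cat"): {"<end>": 1.0},
}

def next_token_probs(prefix):
    """Return the next-token distribution for a caption prefix."""
    return TOY_MODEL[tuple(prefix)]

def greedy_decode(max_len=10):
    """Pick the single most probable token at every step."""
    caption, log_prob = [], 0.0
    for _ in range(max_len):
        probs = next_token_probs(caption)
        token = max(probs, key=probs.get)
        caption.append(token)
        log_prob += math.log(probs[token])
        if token == "<end>":
            break
    return caption, log_prob

def beam_search_decode(beam_width=2, max_len=10):
    """Keep the beam_width best partial captions at every step."""
    beams = [([], 0.0)]   # (tokens so far, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for caption, log_prob in beams:
            for token, p in next_token_probs(caption).items():
                candidates.append((caption + [token], log_prob + math.log(p)))
        # Sort by score and keep only the top beam_width candidates.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for caption, log_prob in candidates[:beam_width]:
            if caption[-1] == "<end>":
                finished.append((caption, log_prob))
            else:
                beams.append((caption, log_prob))
        if not beams:
            break
    finished.extend(beams)  # fall back to unfinished captions if max_len was hit
    return max(finished, key=lambda c: c[1])

greedy_caption, greedy_lp = greedy_decode()
beam_caption, beam_lp = beam_search_decode(beam_width=2)
print(" ".join(greedy_caption), round(math.exp(greedy_lp), 2))
print(" ".join(beam_caption), round(math.exp(beam_lp), 2))
```

In this toy table, greedy decoding commits to the locally best first word and ends with a lower-probability caption, while a beam of width 2 keeps the alternative alive and recovers the higher-probability one. With `beam_width=1` the beam search reduces to greedy decoding.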
Pages: 265-274
Page count: 9
Related papers
  • [1] Automatic image captioning system based on augmentation and ranking mechanism
    Revathi, B. S.
    Kowshalya, A. Meena
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (01) : 265 - 274