Automatic image captioning system based on augmentation and ranking mechanism

Authors
B. S. Revathi
A. Meena Kowshalya
Affiliation
[1] Government College of Technology,
Keywords
Image captioning; Deep neural network; Encoder-decoder architecture; Ranking LSTM;
Abstract
Automatically producing syntactically and semantically accurate image captions remains an open challenge. This paper proposes a pretrained Augmentation–Ranking (A–R) image captioning model that improves the properties of the input images and produces appropriate captions. A novel augmentation strategy improves the operation of the convolutional neural network (CNN), while ranking and feedback propagation improve the Long Short-Term Memory (LSTM) network. The model seeks to address the issues of complexity, vanishing gradients and context during training. The augmented CNN is used to expand the training image dataset, and the Ranking LSTM uses ranks to help identify semantically appropriate captions. Combining the two components improves the performance of the image captioning system. The A–R model is examined under maximum and average pooling, using both greedy and beam search. The results are compared with state-of-the-art models such as the bidirectional recurrent neural network, Google NIC and Bi-LSTM combined with a semantic attention mechanism. The model is evaluated on the Flickr8k and Flickr30k datasets using measures including BLEU, METEOR and CIDEr. According to the experimental results, the proposed model, with reduced complexity, generated captions deemed accurate and syntactically and semantically correct, achieving an accuracy of 74.87%, above all baseline models.
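The abstract mentions decoding with both greedy and beam search. As a rough illustration of how the two strategies can differ, the following is a minimal sketch and not the authors' implementation. The lookup table `TOY_MODEL` and the helper `next_probs` are hypothetical stand-ins for a trained CNN-LSTM decoder, and the probabilities are invented for the example.

```python
import math

# Hypothetical toy "decoder": maps a caption prefix to next-token probabilities.
# A real captioner would condition on CNN image features and an LSTM state.
TOY_MODEL = {
    (): {"a": 0.6, "the": 0.4},
    ("a",): {"dog": 0.5, "cat": 0.3, "bird": 0.2},
    ("the",): {"dog": 0.9, "cat": 0.1},
}


def next_probs(prefix):
    """Return next-token probabilities; unknown prefixes simply end the caption."""
    return TOY_MODEL.get(tuple(prefix), {"<end>": 1.0})


def greedy_search(max_len=5):
    """Pick the single most probable token at every step."""
    seq, logp = [], 0.0
    for _ in range(max_len):
        tok, p = max(next_probs(seq).items(), key=lambda kv: kv[1])
        logp += math.log(p)
        if tok == "<end>":
            break
        seq.append(tok)
    return seq, logp


def beam_search(beam_width=2, max_len=5):
    """Keep the beam_width best partial captions (by log-probability) per step."""
    beams = [([], 0.0, False)]  # (tokens, log-probability, finished?)
    for _ in range(max_len):
        candidates = []
        for seq, logp, done in beams:
            if done:
                candidates.append((seq, logp, True))
                continue
            for tok, p in next_probs(seq).items():
                if tok == "<end>":
                    candidates.append((seq, logp + math.log(p), True))
                else:
                    candidates.append((seq + [tok], logp + math.log(p), False))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(done for _, _, done in beams):
            break
    best_seq, best_logp, _ = beams[0]
    return best_seq, best_logp


if __name__ == "__main__":
    print("greedy:", greedy_search())
    print("beam  :", beam_search(beam_width=2))
```

In this toy setting greedy search commits to the locally best first word, while a beam of width 2 keeps a second hypothesis alive and ends with a caption of higher overall probability. Setting `beam_width=1` reduces beam search to greedy search.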
Pages: 265-274
Number of pages: 9