Automatic image captioning system based on augmentation and ranking mechanism

Cited by: 0

Authors:

B. S. Revathi

A. Meena Kowshalya

Affiliation:

[1] Government College of Technology,

Source:

Signal, Image and Video Processing | 2024 / Vol. 18

Keywords:

Image captioning; Deep neural network; Encoder-decoder architecture; Ranking LSTM

DOI:

Not available

Chinese Library Classification number:

Subject classification code:

Abstract:

Automatically producing syntactically and semantically accurate captions remains an open research challenge. This paper proposes a pretrained Augmentation–Ranking (A–R) image captioning model that improves the properties of the images and produces appropriate captions. A novel augmentation strategy improves the convolutional neural network (CNN), while Ranking and Feedback Propagation improve the Long Short-Term Memory (LSTM) network. The model seeks to address the issues of complexity, vanishing gradients and context during training. The augmented CNN expands the image dataset used for training, and the Ranking LSTM uses ranks to help identify semantically appropriate captions. Combining the two improves the performance of the image captioning system. The proposed A–R model is examined under maximum and average pooling, using both greedy and beam search. The results are compared with state-of-the-art models such as the bidirectional recurrent neural network, Google NIC, and Bi-LSTM combined with a semantic attention mechanism. The model is evaluated on the Flickr8k and Flickr30k datasets using measures including BLEU, METEOR and CIDEr. According to the experimental results, the proposed model, with reduced complexity, generates captions deemed accurate, syntactically correct and semantically correct, and achieves an accuracy of 74.87%, exceeding all baseline models.

Pages: 265 - 274

Page count: 9

50 items in total

[41] Show, Recall, and Tell: Image Captioning with Recall Mechanism
Wang, Li
Bai, Zechen
Zhang, Yonghua
Lu, Hongtao
THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 12176 - 12183
[42] Image captioning based on dependency syntax
Bi J.
Liu M.
Hu H.
Dai J.
Beijing Hangkong Hangtian Daxue Xuebao/Journal of Beijing University of Aeronautics and Astronautics, 2021, 47 (03) : 431 - 440
[43] Reference Based LSTM for Image Captioning
Chen, Minghai
Ding, Guiguang
Zhao, Sicheng
Chen, Hui
Han, Jungong
Liu, Qiang
THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017 : 3981 - 3987
[44] Image Captioning Based on Semantic Scenes
Zhao, Fengzhi
Yu, Zhezhou
Wang, Tao
Lv, Yi
ENTROPY, 2024, 26 (10)
[45] Phrase-based Image Captioning
Lebret, Remi
Pinheiro, Pedro O.
Collobert, Ronan
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 37, 2015, 37 : 2085 - 2094
[46] Automatic Captioning based on Visible and Infrared Images
Wang, Yan
Lou, Shuli
Wang, Kai
Wang, Yunzhe
Yuan, Xiaohu
Liu, Huaping
2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2024), 2024 : 11312 - 11318
[47] Automatic image captioning in Thai for house defect using a deep learning-based approach
Manadda Jaruschaimongkol
Krittin Satirapiwong
Kittipan Pipatsattayanuwong
Suwant Temviriyakul
Ratchanat Sangprasert
Thitirat Siriborvornratanakul
Advances in Computational Intelligence, 2024, 4 (1)
[48] Automatic Arabic Image Captioning using RNN-LSTM-Based Language Model and CNN
Al-Muzaini, Huda A.
Al-Yahya, Tasniem N.
Benhidour, Hafida
INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2018, 9 (06) : 67 - 73
[49] A HTTP Botnet Detection System Based on Ranking Mechanism
Lee, Yuan-Chin
Tseng, Chuan-Mu
Liu, Tzong-Jye
2017 TWELFTH INTERNATIONAL CONFERENCE ON DIGITAL INFORMATION MANAGEMENT (ICDIM), 2017 : 115 - 120
[50] CIC-BART-SSA: Controllable Image Captioning with Structured Semantic Augmentation
Basioti, Kalliopi
Abdelsalam, Mohamed A.
Fancellu, Federico
Pavlovic, Vladimir
Fazly, Afsaneh
COMPUTER VISION - ECCV 2024, PT LXVI, 2025, 15124 : 444 - 461