Enhancing Thai Image Captioning Performance using CNN and Bidirectional LSTM

Cited by: 0
Authors
Tieancho, Witchaphon [1]
Phumeechanya, Sopon [1]
Affiliations
[1] Silpakorn Univ, Fac Engn & Ind Technol, Dept Elect Engn, Nakhon Pathom, Thailand
Source
2024 21ST INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING/ELECTRONICS, COMPUTER, TELECOMMUNICATIONS AND INFORMATION TECHNOLOGY, ECTI-CON 2024 | 2024
Keywords
Thai captions; convolutional neural network; bidirectional LSTM; BLEU;
DOI
10.1109/ECTI-CON60892.2024.10595011
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
This research presents a deep learning model that generates Thai captions for images. A convolutional neural network (CNN) with the VGG16 architecture extracts image features, and a bidirectional LSTM generates the captions from them. The dataset used for training and testing is Flickr8k combined with custom traffic-related images and captions. For the first dataset, Flickr8k, all captions were translated from English to Thai using Google Translate. Preprocessing before training removed special characters so that the Thai descriptions would not be distorted. The captions produced by the model were evaluated against the reference captions using the BLEU metric, with scores compared for n-grams up to 4-grams. The resulting average score was higher than those of the compared models.
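The BLEU comparison "up to 4-grams" mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code; it assumes the standard BLEU definition (clipped n-gram precision, uniform weights, brevity penalty, no smoothing) and assumes captions are already segmented into word tokens, which Thai text requires because it has no spaces between words. The function names and the sample tokens are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count every n-gram (as a tuple) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of the candidate against all references."""
    cand = ngram_counts(candidate, n)
    if not cand:
        return 0.0
    # For each n-gram, the largest count seen in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngram_counts(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand.items())
    return clipped / sum(cand.values())


def bleu(candidate, references, max_n):
    """Cumulative BLEU-max_n with uniform weights and no smoothing."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # without smoothing, one zero precision zeroes the score
    c = len(candidate)
    # Reference length closest to the candidate length (ties: shorter one).
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(math.log(p) for p in precisions) / max_n)


# Hypothetical pre-segmented Thai captions (assumed tokenization).
reference = ["สุนัข", "วิ่ง", "บน", "สนามหญ้า", "สีเขียว"]
candidate = ["สุนัข", "วิ่ง", "บน", "สนามหญ้า"]

# BLEU-1 through BLEU-4, mirroring a comparison "up to 4-grams".
scores = {n: bleu(candidate, [reference], n) for n in range(1, 5)}
```

Reporting BLEU-1 through BLEU-4 separately shows how quickly a caption's score drops as longer word sequences must match, which is why results are usually tabulated per n-gram order.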
Pages: 5
Related papers
50 in total
  • [32] A New Attention-Based LSTM for Image Captioning
    Xiao, Fen
    Xue, Wenfeng
    Shen, Yanqing
    Gao, Xieping
    NEURAL PROCESSING LETTERS, 2022, 54 (04) : 3157 - 3171
  • [33] ThaiTC: Thai Transformer-based Image Captioning
    Jaknamon, Teetouch
    Marukatat, Sanparith
    2022 17TH INTERNATIONAL JOINT SYMPOSIUM ON ARTIFICIAL INTELLIGENCE AND NATURAL LANGUAGE PROCESSING (ISAI-NLP 2022) / 3RD INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND INTERNET OF THINGS (AIOT 2022), 2022,
  • [34] Next-LSTM: a novel LSTM-based image captioning technique
    Singh, Priya
    Kumar, Chandan
    Kumar, Ayush
    INTERNATIONAL JOURNAL OF SYSTEM ASSURANCE ENGINEERING AND MANAGEMENT, 2023, 14 (04) : 1492 - 1503
  • [36] A Hybrid of Deep CNN and Bidirectional LSTM for Automatic Speech Recognition
    Passricha, Vishal
    Aggarwal, Rajesh Kumar
    JOURNAL OF INTELLIGENT SYSTEMS, 2020, 29 (01) : 1261 - 1274
  • [37] Image captioning with triple-attention and stack parallel LSTM
    Zhu, Xinxin
    Li, Lixiang
    Liu, Jing
    Li, Ziyi
    Peng, Haipeng
    Niu, Xinxin
    NEUROCOMPUTING, 2018, 319 : 55 - 65
  • [38] Attention Based Double Layer LSTM for Chinese Image Captioning
    Wu, Wei
    Sun, Deshuai
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [39] phi-LSTM: A Phrase-Based Hierarchical LSTM Model for Image Captioning
    Tan, Ying Hua
    Chan, Chee Seng
    COMPUTER VISION - ACCV 2016, PT V, 2017, 10115 : 101 - 117
  • [40] Recurrent Neural Networks for Image Captioning: A Case Study with LSTM
    Mohite, Shailaja Sanjay
    Suganthini, C.
    Arunarani, A. R.
    Devi, K. Lalitha
    Sharma, Manish
    Patil, R. N.
    Shrivastava, Anurag
    JOURNAL OF ELECTRICAL SYSTEMS, 2024, 20 (03) : 1082 - 1092