Automatic Indonesian Image Caption Generation using CNN-LSTM Model and FEEH-ID Dataset

Cited by: 0
Authors
Mulyanto, Edy [1]
Setiawan, Esther Irawati [2]
Yuniarno, Eko Mulyanto [3]
Purnomo, Mauridhi Hery [3]
Affiliations
[1] Univ Dian Nuswantoro, Inst Teknol Sepuluh Nopember, Dept Elect Engn, Dept Informat Engn, Surabaya, Indonesia
[2] Sekolah Tinggi Tekn Surabaya, Inst Teknol Sepuluh Nopember, Dept Elect Engn, Dept Informat Engn, Surabaya, Indonesia
[3] Inst Teknol Sepuluh Nopember, Dept Elect Engn, Dept Comp Engn, Surabaya, Indonesia
Source
2019 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND VIRTUAL ENVIRONMENTS FOR MEASUREMENT SYSTEMS AND APPLICATIONS (CIVEMSA 2019) | 2019
Keywords
image captioning; CNN; LSTM; FEEH-ID;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Image captioning is a challenging problem in computer vision research. This paper extends research on automatic image caption generation to the Indonesian language: descriptions in Indonesian sentences are generated for unlabeled images. The dataset used is FEEH-ID, the first Indonesian image captioning dataset. This research is crucial because no corpus for image captioning in Indonesian was previously available. The paper compares experimental results on the FEEH-ID dataset with those on English, Chinese, and Japanese datasets using CNN and LSTM models. On the test set, the proposed model achieves promising results of 50.0 for BLEU-1 and 23.9 for BLEU-3, which is above the average of the BLEU evaluation results on the other language datasets. The merge model combining CNN and LSTM gives fairly good results on the FEEH-ID dataset. The experimental results are expected to improve with a larger dataset.
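For readers unfamiliar with the BLEU-1 and BLEU-3 figures quoted in the abstract, the following is a minimal sketch of sentence-level cumulative BLEU-n as defined by Papineni et al. (clipped n-gram precision, uniform weights, brevity penalty). It is not the authors' evaluation code: the function names, the whitespace tokenization, and the Indonesian example sentences are illustrative assumptions, and published scores are normally computed at corpus level. The sketch returns values between 0 and 1; the abstract's figures appear to be on a 0 to 100 scale, which is also an assumption.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as many times as it occurs in any single reference."""
    cand_counts = ngrams(candidate, n)
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / total


def bleu(candidate, references, max_n):
    """Sentence-level cumulative BLEU-max_n with uniform weights, in [0, 1]."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # unsmoothed: any zero precision gives a zero score
    c = len(candidate)
    # Reference length closest to the candidate length (ties pick the shorter).
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity_penalty = 1.0 if c > r else math.exp(1.0 - r / c)
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    return brevity_penalty * geo_mean


# Hypothetical caption pair, not taken from FEEH-ID.
candidate = "seorang pria bermain bola di lapangan".split()
references = ["seorang pria sedang bermain bola di lapangan".split()]
print("BLEU-1:", round(bleu(candidate, references, 1), 3))
print("BLEU-3:", round(bleu(candidate, references, 3), 3))
```

BLEU-3 is never higher than BLEU-1 for the same output under this definition, because it also requires matching bigrams and trigrams and the precision at each n-gram order cannot exceed the precision at the order below it. This is consistent with the gap between the two scores in the abstract.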
Pages: 151-155
Number of pages: 5