Story Generation from Images Using Deep Learning

Cited by: 0
Authors
Alnami, Abrar [1]
Almasre, Miada [1]
Al-Malki, Norah [1]
Affiliations
[1] King Abdulaziz Univ, Jeddah, Saudi Arabia
Source
INFORMATION, COMMUNICATION AND COMPUTING TECHNOLOGY (ICICCT 2021) | 2021 / Vol. 1417
Keywords
Convolutional neural network; Deep learning; Object detection; Image captioning; Long short-term memory; Neural networks; Classification
DOI
10.1007/978-3-030-88378-2_16
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Generating descriptive captions for images has recently become a significant research problem. However, the expressivity of human languages has been among the challenges that have kept researchers from experimenting widely with linguistically rich image captions. This motivated us to apply advanced deep learning algorithms to caption generation. We propose an AI model, built on deep learning and natural language processing algorithms, with two main components: an image-feature extractor and a story generator. The first component (the image-feature extractor) was trained to predict the names of objects in images. The second component (the story generator) was trained on a custom set of short descriptive sentences, which were treated as short stories. The output of the first component (a list of words) is fed into the second component to generate a story about the input image. Thus, at test time, the list of object names produced by the first component is passed to the story generator, which arranges the names and generates a short story from them. The proposed model could generate a short story expressive of an input image, as indicated by a BLEU score of 0.59, which further research is planned to improve.
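The abstract reports a BLEU score but does not say which BLEU variant, n-gram order, or reference set was used. As a rough illustration of how such a score is computed, the sketch below implements a sentence-level BLEU with clipped n-gram precision and a brevity penalty. The single-reference setup, the default `max_n=2`, the absence of smoothing, and the example sentences are all assumptions made for illustration, not details taken from the paper.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=2):
    """Sentence-level BLEU with uniform weights over 1..max_n grams.

    Assumptions: one reference, no smoothing. The paper does not state
    which BLEU configuration produced its reported score.
    """
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # a zero precision makes the geometric mean zero
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize candidates that are not longer than the reference.
    c, r = len(candidate), len(reference)
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(sum(log_precisions) / max_n)


# Hypothetical sentences, not drawn from the paper's data.
reference = "a dog runs on the grass".split()
candidate = "a dog runs on grass".split()
score = bleu(candidate, reference)
print(round(score, 3))
```

Established toolkits apply smoothing and corpus-level aggregation, so their numbers can differ from this simplified version.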
Pages: 198-208
Number of pages: 11