Story Generation from Images Using Deep Learning

Cited by: 0

Authors:

Alnami, Abrar [1]

Almasre, Miada [1]

Al-Malki, Norah [1]

Affiliation:

[1] King Abdulaziz Univ, Jeddah, Saudi Arabia

Source:

INFORMATION, COMMUNICATION AND COMPUTING TECHNOLOGY (ICICCT 2021) | 2021 / Vol. 1417

Keywords:

Convolutional neural network; Deep learning; Object detection; Image captioning; Long short-term memory; Neural networks; Classification

DOI:

10.1007/978-3-030-88378-2_16

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Generating descriptive captions for images has recently become a significant research problem. However, the expressivity of human languages has been one of the challenges that kept researchers from experimenting widely with linguistically rich image captions. This motivated us to use advanced deep learning algorithms to generate captions for images. The researchers proposed an AI model, based on deep learning and natural language processing algorithms, with two main components: an image-feature extractor and a story generator. The first component (image-feature extractor) was trained to predict the names of objects in images. The second component (story generator) was trained on a custom set of short descriptive sentences, which were treated as short stories. The output of the first component (a list of words) is fed into the second component to generate a story about the input image. Thus, at test time, the list of object names from the first component is passed to the story generator, which arranges the names and generates a short story from them. The proposed model could generate a short story expressive of an input image, as indicated by a BLEU score of 0.59, which further research is planned to improve.
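For readers unfamiliar with the BLEU metric reported in the abstract, the following is a minimal sketch of a simplified sentence-level BLEU: clipped n-gram precision up to 4-grams, uniform weights, a single reference, no smoothing, and the standard brevity penalty. The record does not say how the authors computed their score, so this is an illustration of the metric under those assumptions, not their evaluation code. The function names `ngrams` and `sentence_bleu` and the example sentences are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, candidate, max_n=4):
    """Simplified BLEU: one reference, uniform n-gram weights, no smoothing."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        ref = ngrams(reference, n)
        total = sum(cand.values())
        # Clipped matches: a candidate n-gram counts at most as often
        # as it occurs in the reference.
        matched = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or matched == 0:
            # Without smoothing, any zero precision makes the score zero.
            return 0.0
        log_precisions.append(math.log(matched / total))
    # Brevity penalty: penalize candidates no longer than the reference.
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    # Hypothetical sentences, chosen only to exercise the function.
    reference = "the cat sat on the mat".split()
    candidate = "the cat sat on a mat".split()
    print(round(sentence_bleu(reference, candidate), 4))
```

A score of 1.0 means the candidate matches the reference exactly on all n-gram orders. Scores such as the reported 0.59 depend on tokenization, smoothing, and the number of references, none of which are stated in this record.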


Pages: 198-208

Page count: 11
