Incorporating external knowledge for image captioning using CNN and LSTM

Cited by: 27
Authors
Sharma, Himanshu [1]
Jalal, Anand Singh [1]
Affiliations
[1] GLA Univ Mathura, Dept Comp Engn & Applicat, Mathura 281406, Uttar Pradesh, India
Source
MODERN PHYSICS LETTERS B | 2020 / Vol. 34 / Issue 28
Keywords
Image processing; natural language processing; commonsense reasoning; world knowledge; knowledge bases; attention; network
DOI
10.1142/S0217984920503157
Chinese Library Classification number
O59 [Applied Physics];
Subject classification code
Abstract
Image captioning is a multidisciplinary artificial intelligence (AI) research task that has captured the interest of both image processing and natural language processing experts. It is a complex problem because it sometimes requires information that is not directly visible in a given scene; it may require commonsense interpretation or detailed knowledge about the objects present in the image. In this paper, we present a method that uses both visual information and external knowledge from knowledge bases such as ConceptNet to describe images better. We demonstrate the usefulness of the method on two publicly available datasets, Flickr8k and Flickr30k. The results show that the proposed model outperforms state-of-the-art approaches for generating image captions. Finally, we discuss possible future prospects in image captioning.
Pages: 12