Image Caption Generation with Local Semantic and Global Information

Citations: 0
Authors
Liu, Xing [1 ]
Liu, Weibin [1 ]
Xing, Weiwei [2 ]
Affiliations
[1] Beijing Jiaotong Univ, Inst Informat Sci, Beijing, Peoples R China
[2] Beijing Jiaotong Univ, Sch Software Engineering, Beijing, Peoples R China
Source
2019 IEEE SMARTWORLD, UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTING, SCALABLE COMPUTING & COMMUNICATIONS, CLOUD & BIG DATA COMPUTING, INTERNET OF PEOPLE AND SMART CITY INNOVATION (SMARTWORLD/SCALCOM/UIC/ATC/CBDCOM/IOP/SCI 2019) | 2019
Funding
National Natural Science Foundation of China;
Keywords
component; image caption; computer vision; feature extraction; LSTM; image representation;
DOI
10.1109/SmartWorld-UIC-ATC-SCALCOM-IOP-SCI.2019.00152
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Different regions of an image play different roles in image description: some key information is confined to a small region, while other important features must be extracted from the whole image. Typically, a CNN alone is used to extract the features of an image, and those features are then used to generate its description. However, this approach tends to overlook some important information in the image. In this paper, we propose an image description method that combines the local information and global features of an image. The local information is extracted by an object detection model (SSD), and the global features are extracted by a multi-instance learning (MIL) method. Our model, which combines these two methods, performs well on the public MS-COCO dataset.
Pages: 680-685
Page count: 6