Image to Text Conversion: State of the Art and Extended Work

Cited by: 9
Authors
Farhani, Nada [1]
Terbeh, Naim [1]
Zrigui, Mounir [1]
Affiliations
[1] LaTICE Lab, Monastir, Tunisia
Source
2017 IEEE/ACS 14TH INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS (AICCSA) | 2017
Keywords
Modality; learning; image processing; automatic phrase generation; PTT Conversion
DOI
10.1109/AICCSA.2017.159
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This article studies the conversion of information between modalities (text and image), motivated by the evolution of human-machine communication, which has introduced modalities that are natural to humans, such as gestures, speech, sound, and vision. One of the main challenges of this "multimodal" learning is to learn a shared representation across the distinct modalities and to predict missing data in one modality conditioned on another (for example, by retrieval or synthesis). Existing research addresses several types of conversion, including Text to Speech, Speech to Picture, and Text to Picture synthesis and their inverses; this paper focuses on Text to Picture (TTP) and Picture to Text (PTT) synthesis.
Pages: 937-943
Number of pages: 7