Understanding Subtitles by Character-Level Sequence-to-Sequence Learning

Cited by: 94
Authors
Zhang, Haijun [1 ]
Li, Jingxuan [1 ,2 ]
Ji, Yuzhu [1 ]
Yue, Heng [3 ]
Affiliations
[1] Harbin Inst Technol, Shenzhen Grad Sch, Shenzhen 518055, Peoples R China
[2] Huawei Technol Co Ltd, Shenzhen 518129, Peoples R China
[3] Northeastern Univ, Natl Key Lab Synthet Automat Proc Ind, Shenyang 110004, Peoples R China
Keywords
Character-level; neural machine translation; recurrent neural network (RNN); sequence learning; NETWORKS;
DOI
10.1109/TII.2016.2601521
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
This paper presents a character-level sequence-to-sequence learning method, RNNembed. This method allows the system to read raw characters, instead of words generated by preprocessing steps, into a pure single neural network model under an end-to-end framework. Specifically, we embed a recurrent neural network into an encoder-decoder framework and generate character-level sequence representations as input. This significantly reduces the dimension of the input feature space and avoids the need to handle unknown or rare words in sequences. In the language model, we improve the basic structure of a gated recurrent unit by adding an output gate, which filters out unimportant information involved in the attention scheme of the alignment model. Our proposed method was examined on a large-scale dataset for an English-to-Chinese translation task. Experimental results demonstrate that the proposed approach achieves translation performance comparable, or close, to conventional word-based and phrase-based systems.
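The abstract's key architectural change is a gated recurrent unit (GRU) augmented with an output gate that filters the hidden state before it feeds the attention mechanism. A minimal NumPy sketch of this idea follows; the weight names, the exact gating equation, and the class name are assumptions for illustration, not the paper's published formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUWithOutputGate:
    """Standard GRU cell plus a hypothetical extra output gate o_t.

    The output gate filters the hidden state before it is exposed to
    the attention/alignment model; the internal recurrent state h_t
    is propagated unfiltered. Weight names are illustrative only.
    """

    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        def w(rows, cols):
            return rng.normal(0.0, 0.1, (rows, cols))
        # Update gate z, reset gate r, candidate state, and the added output gate o.
        self.Wz, self.Uz = w(hidden_size, input_size), w(hidden_size, hidden_size)
        self.Wr, self.Ur = w(hidden_size, input_size), w(hidden_size, hidden_size)
        self.Wh, self.Uh = w(hidden_size, input_size), w(hidden_size, hidden_size)
        self.Wo, self.Uo = w(hidden_size, input_size), w(hidden_size, hidden_size)
        self.hidden_size = hidden_size

    def step(self, x, h_prev):
        z = sigmoid(self.Wz @ x + self.Uz @ h_prev)            # update gate
        r = sigmoid(self.Wr @ x + self.Ur @ h_prev)            # reset gate
        h_tilde = np.tanh(self.Wh @ x + self.Uh @ (r * h_prev))
        h = (1.0 - z) * h_prev + z * h_tilde                   # internal recurrent state
        o = sigmoid(self.Wo @ x + self.Uo @ h_prev)            # added output gate
        return h, o * np.tanh(h)                               # filtered output for attention

# Run a short character sequence through the cell.
cell = GRUWithOutputGate(input_size=4, hidden_size=3)
h = np.zeros(3)
for t in range(5):
    x = np.full(4, 0.1 * t)        # stand-in for a character embedding
    h, attn_input = cell.step(x, h)
print(attn_input.shape)
```

Because the output gate multiplies `tanh(h)` elementwise by a sigmoid, the vector handed to the attention scheme is bounded in (-1, 1) and can suppress dimensions the gate deems unimportant, while the recurrence itself is unchanged.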
Pages: 616-624
Number of pages: 9