Overview of Character-Based Models for Natural Language Processing

Cited by: 3
Authors
Adel, Heike [1 ]
Asgari, Ehsaneddin [1 ,2 ]
Schuetze, Hinrich [1 ]
Affiliations
[1] Ludwig Maximilians Univ Munchen, Ctr Informat & Language Proc, Munich, Germany
[2] Univ Calif Berkeley, Appl Sci & Technol, Berkeley, CA 94720 USA
Source
COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING (CICLING 2017), PT I | 2018 / Vol. 10761
Keywords
Natural language processing; Neural networks; Document representation; Feature selection; Natural language generation; Language models; Structured prediction; Supervised learning by classification; Speech recognition
DOI
10.1007/978-3-319-77113-7_1
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Character-based models have become increasingly popular for a range of natural language processing tasks, largely owing to the success of neural networks. They make it possible to model text sequences directly, without the need for tokenization, and can therefore enhance the traditional preprocessing pipeline. This paper provides an overview of character-based models for a variety of natural language processing tasks. We group existing work into three categories: tokenization-based approaches, bag-of-n-gram models, and end-to-end models. For each category, we present prominent examples of studies, with a particular focus on recent character-based deep learning work.
Pages: 3-16
Page count: 14