Japanese Text Classification by Character-level Deep ConvNets and Transfer Learning

Cited by: 7
Authors
Sato, Minato [1]
Orihara, Ryohei [1]
Sei, Yuichi [1]
Tahara, Yasuyuki [1]
Ohsuga, Akihiko [1]
Affiliations
[1] Univ Electrocommun, Grad Sch Informat Syst, Tokyo, Japan
Source
ICAART: PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 2 | 2017
Keywords
Deep Learning; Temporal ConvNets; Transfer Learning; Text Classification; Sentiment Analysis;
DOI
10.5220/0006193401750184
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Temporal (one-dimensional) convolutional neural networks (temporal CNNs, ConvNets) are an emerging technology for text understanding. The input to a ConvNet can be either a sequence of words or a sequence of characters. In the latter case, there is no need for language-dependent natural language processing such as morphological analysis. Past studies showed that character-level ConvNets work well for news category classification and sentiment analysis/classification tasks on English and romanized Chinese text corpora. In this article, we apply character-level ConvNets to Japanese text understanding. We also attempt to reuse meaningful representations that the ConvNets learn from a large-scale dataset, in the form of transfer learning, inspired by its success in the field of image recognition. On news category classification and sentiment analysis/classification tasks on Japanese text corpora, the ConvNets outperformed N-gram-based classifiers. In addition, our ConvNet transfer learning frameworks worked well for a task that is similar to the one used for pre-training.
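To make the character-level input concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a simple scheme in which every distinct character of the corpus gets its own index and is one-hot encoded; the abstract does not say how the paper encodes Japanese characters. The function names (`build_alphabet`, `encode`, `one_hot`, `temporal_conv`), the toy sentences, and the all-ones filter are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def build_alphabet(texts: List[str]) -> Dict[str, int]:
    """Map each distinct character to an index starting at 1 (0 = padding/unknown)."""
    chars = sorted({ch for text in texts for ch in text})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode(text: str, alphabet: Dict[str, int], length: int) -> List[int]:
    """Read the string one character at a time; truncate or right-pad with 0.

    No tokenizer or morphological analyzer is involved (assumed scheme).
    """
    ids = [alphabet.get(ch, 0) for ch in text[:length]]
    return ids + [0] * (length - len(ids))


def one_hot(ids: List[int], vocab_size: int) -> List[List[float]]:
    """One row per character; index 0 becomes an all-zero row."""
    rows = []
    for i in ids:
        row = [0.0] * vocab_size
        if i > 0:
            row[i - 1] = 1.0
        rows.append(row)
    return rows


def temporal_conv(x: List[List[float]], kernel: List[List[float]]) -> List[float]:
    """Single-filter 1-D convolution (cross-correlation) along the time axis.

    x has T rows, kernel has k rows, both of width D; returns T - k + 1 values.
    """
    k = len(kernel)
    out = []
    for t in range(len(x) - k + 1):
        total = 0.0
        for j in range(k):
            total += sum(a * b for a, b in zip(x[t + j], kernel[j]))
        out.append(total)
    return out


# Toy usage: two short Japanese strings, sequence length 8, filter width 3.
corpus = ["映画は最高", "映画は退屈"]
alphabet = build_alphabet(corpus)
ids = encode("映画は最高", alphabet, length=8)
x = one_hot(ids, vocab_size=len(alphabet))
kernel = [[1.0] * len(alphabet) for _ in range(3)]  # hypothetical all-ones filter
features = temporal_conv(x, kernel)
```

With the all-ones filter, each output value simply counts the non-padding characters in a window of three positions. A trained ConvNet would learn many such filters and stack pooling and further layers on top, which this sketch leaves out.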
Pages: 175-184
Page count: 10