Transfer Learning Method for Very Deep CNN for Text Classification and Methods for its Evaluation

Cited by: 16
Authors
Moriya, Shun [1 ]
Shibata, Chihiro [1 ]
Affiliations
[1] Tokyo Univ Technol, Dept Comp Sci, Hachioji, Tokyo, Japan
Source
2018 IEEE 42ND ANNUAL COMPUTER SOFTWARE AND APPLICATIONS CONFERENCE (COMPSAC 2018), VOL 2 | 2018
Funding
Japan Society for the Promotion of Science (JSPS);
Keywords
transfer learning; text classification; CNN; residual network;
DOI
10.1109/COMPSAC.2018.10220
Chinese Library Classification
TP39 [Applications of computers];
Subject Classification Codes
081203 ; 0835 ;
Abstract
In recent years, it has become possible to perform text classification with high accuracy by using convolutional neural networks (CNNs). Zhang et al. decomposed words into characters and classified texts using a CNN with relatively deep layers to obtain excellent classification results. However, it is often difficult to prepare a sufficient number of labeled samples for solving real-world text-classification problems. One method for handling this problem is transfer learning, which uses a network tuned for an arbitrary task as the initial network for a target task. While transfer learning is known to be effective for image recognition, for tasks in natural language processing, such as document classification, it has not yet been shown for what types of data and to what extent transfer learning is effective. In this paper, we first introduce a character-level CNN adopting the structure of a residual network to construct a network with deeper layers for Japanese text classification. We then demonstrate that we can improve classification accuracy by performing transfer learning between two particular datasets. Additionally, we propose an approach to evaluate the effectiveness of transfer learning and use it to evaluate our model.
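The transfer-learning scheme the abstract describes — reusing a network tuned on a source task as the initial network for a target task, with the classifier head re-initialized for the new label set — can be sketched as follows. This is a minimal illustration in plain Python; the layer names, shapes, and the `residual_block` helper are assumptions for exposition, not the paper's actual architecture.

```python
import random

def residual_block(x, transform):
    """Residual (skip) connection: y = x + F(x).

    Adding the input back to the transformed output is what lets
    very deep stacks of convolutional layers train stably.
    """
    fx = transform(x)
    return [xi + fi for xi, fi in zip(x, fx)]

def transfer_init(source_params, target_num_classes, hidden_dim):
    """Initialize a target network from a trained source network.

    All feature-extraction layers are copied from the source; only the
    final output layer is re-initialized, because the target task has a
    different set of class labels. Names here are illustrative.
    """
    target = {name: list(w) for name, w in source_params.items()
              if name != "output"}
    # New classifier head: one weight row per target class, random init.
    target["output"] = [
        [random.uniform(-0.05, 0.05) for _ in range(hidden_dim)]
        for _ in range(target_num_classes)
    ]
    return target

# Toy "source" network trained on a 4-class task; target task has 6 classes.
source = {
    "conv_block1": [0.1, 0.2],
    "conv_block2": [0.3, 0.4],
    "output": [[0.5] * 8 for _ in range(4)],
}
target = transfer_init(source, target_num_classes=6, hidden_dim=8)
```

In this sketch only `output` is replaced; in practice one would also choose whether the copied layers stay frozen or are fine-tuned on the target data, which is one of the axes the paper's evaluation approach is designed to probe.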
Pages: 153 - 158
Page count: 6
Related References
11 entries
  • [1] XGBoost: A Scalable Tree Boosting System
    Chen, Tianqi
    Guestrin, Carlos
    [J]. KDD'16: PROCEEDINGS OF THE 22ND ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2016, : 785 - 794
  • [2] Conneau A, 2017, Very Deep Convolutional Networks for Text Classification, 15TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2017), VOL 1: LONG PAPERS, P1107
  • [3] Cortes C., Vapnik V., 1995, Support-Vector Networks, MACHINE LEARNING, V20, P273, DOI 10.1023/A:1022627411411
  • [4] Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
    He, Kaiming
    Zhang, Xiangyu
    Ren, Shaoqing
    Sun, Jian
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 1026 - 1034
  • [5] Deep Residual Learning for Image Recognition
    He, Kaiming
    Zhang, Xiangyu
    Ren, Shaoqing
    Sun, Jian
    [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 770 - 778
  • [6] Kim Y., 2014, PROCEEDINGS OF THE 2014 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), DOI 10.3115/v1/D14-1181
  • [7] ImageNet Classification with Deep Convolutional Neural Networks
    Krizhevsky, Alex
    Sutskever, Ilya
    Hinton, Geoffrey E.
    [J]. COMMUNICATIONS OF THE ACM, 2017, 60 (06) : 84 - 90
  • [8] Mou L., 2016, PROCEEDINGS OF THE 2016 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), P479
  • [9] Japanese Text Classification by Character-level Deep ConvNets and Transfer Learning
    Sato, Minato
    Orihara, Ryohei
    Sei, Yuichi
    Tahara, Yasuyuki
    Ohsuga, Akihiko
    [J]. ICAART: PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 2, 2017, : 175 - 184
  • [10] Yosinski J, 2014, ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS, V27