Chinese Text Classification Method Based on BERT Word Embedding

Cited by: 6
Authors
Wang, Ziniu [1 ]
Huang, Zhilin [1 ]
Gao, Jianling [1 ]
Affiliations
[1] Guizhou Univ, Guiyang, Guizhou, Peoples R China
Source
2020 5TH INTERNATIONAL CONFERENCE ON MATHEMATICS AND ARTIFICIAL INTELLIGENCE (ICMAI 2020) | 2020
Keywords
Text Classification; BERT; CapsNet; Word Embedding; BiGRU; Attention mechanism;
DOI
10.1145/3395260.3395273
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we enhance the semantic representation of words with the BERT pre-training language model, which dynamically generates a semantic vector for each character according to its context; the resulting character vectors are then fed, as a character-level word-vector sequence, into a CapsNet. We built a BiGRU module inside the capsule network for text feature extraction and introduced an attention mechanism to focus on key information. For our experiments, we use the corpus of Baidu's Chinese question-and-answer dataset, taking only the question types as classification labels. As comparative baselines, we trained the BERT network and the CapsNet separately. The experimental results show that the combined model outperforms either model used alone.
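The pipeline sketched in the abstract (character-level embeddings → BiGRU feature extraction → attention pooling → classifier) can be illustrated minimally. The following is an illustrative NumPy mock under stated assumptions, not the authors' implementation: the weights are random, the dimensions and the softmax classification head are invented for the example, and the BERT character embeddings are replaced by random vectors.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class GRUCell:
    """Minimal GRU cell with randomly initialised weights (illustrative only)."""
    def __init__(self, d_in, d_h):
        s = 0.1
        self.Wz = rng.normal(0, s, (d_h, d_in + d_h))  # update gate weights
        self.Wr = rng.normal(0, s, (d_h, d_in + d_h))  # reset gate weights
        self.Wh = rng.normal(0, s, (d_h, d_in + d_h))  # candidate state weights

    def step(self, x, h):
        xh = np.concatenate([x, h])
        z = sigmoid(self.Wz @ xh)                      # update gate
        r = sigmoid(self.Wr @ xh)                      # reset gate
        h_tilde = np.tanh(self.Wh @ np.concatenate([x, r * h]))
        return (1 - z) * h + z * h_tilde

def bigru_attention_classify(char_vecs, d_h=8, n_classes=3):
    """BiGRU over character vectors, attention pooling, softmax head."""
    T, d_in = char_vecs.shape
    fwd, bwd = GRUCell(d_in, d_h), GRUCell(d_in, d_h)
    hf, hb = np.zeros(d_h), np.zeros(d_h)
    H_f, H_b = [], []
    for t in range(T):                       # forward direction
        hf = fwd.step(char_vecs[t], hf)
        H_f.append(hf)
    for t in reversed(range(T)):             # backward direction
        hb = bwd.step(char_vecs[t], hb)
        H_b.append(hb)
    # concatenate forward/backward states, aligned per time step
    H = np.stack([np.concatenate([f, b]) for f, b in zip(H_f, reversed(H_b))])
    w = rng.normal(0, 0.1, 2 * d_h)          # attention query vector (assumed)
    alpha = softmax(H @ w)                   # attention weights over time steps
    ctx = alpha @ H                          # attended sequence representation
    W_out = rng.normal(0, 0.1, (n_classes, 2 * d_h))
    return softmax(W_out @ ctx)              # class probabilities

# stand-in for BERT character embeddings of a 5-character sentence
probs = bigru_attention_classify(rng.normal(0.0, 1.0, (5, 16)))
```

In the paper's actual model, `char_vecs` would come from the BERT encoder and the BiGRU sits inside the capsule network; here everything is collapsed into a single forward pass to show how the attention weights combine the bidirectional hidden states.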
Pages: 66-71 (6 pages)