Hierarchical multi-attention networks for document classification

Cited by: 29
Authors
Huang, Yingren [1, 2]
Chen, Jiaojiao [2 ]
Zheng, Shaomin [2 ]
Xue, Yun [2 ]
Hu, Xiaohui [2 ]
Affiliations
[1] Guangdong Univ Foreign Studies, Lab Language Engn & Comp, Guangzhou, Guangdong, Peoples R China
[2] South China Normal Univ, Guangdong Prov Key Lab Quantum Engn & Quantum Mat, Sch Phys & Telecommun Engn, Guangzhou 510006, Peoples R China
Keywords
Document classification; Hierarchical network; Bi-GRU; Attention mechanism; SYSTEM;
DOI
10.1007/s13042-020-01260-x
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Research on document classification increasingly employs attention-based deep learning algorithms and has achieved impressive results. Owing to the complexity of documents, classical models, as well as models with a single attention mechanism, fail to meet the demand for high-accuracy classification. This paper proposes a method that classifies documents with hierarchical multi-attention networks, which describe a document at the word-sentence level and at the sentence-document level. Different attention strategies are applied at the different levels, which enables attention weights to be assigned accurately. Specifically, a soft attention mechanism is applied at the word-sentence level and a CNN-attention mechanism at the sentence-document level. Owing to this design, the proposed method achieves the highest accuracy among the state-of-the-art methods it is compared with. In addition, visualizations of the attention weights show that the attention mechanism is effective in distinguishing the importance of different parts of the document.
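The soft attention step mentioned in the abstract can be illustrated with a minimal sketch. This is a generic dot-product soft attention pooling, not the authors' exact formulation: the function names `softmax` and `soft_attention_pool`, the `context` vector, and the toy word states are all assumptions made for illustration, and the Bi-GRU encoder and the CNN-attention layer are not modelled here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention_pool(hidden_states, context):
    """Pool word-level states into one sentence vector.

    hidden_states: list of equal-length vectors (hypothetical encoder outputs).
    context: a query vector of the same length (assumed to be learned in practice).
    Returns (attention weights, weighted sum of the hidden states).
    """
    # Score each word state by its dot product with the context vector.
    scores = [sum(h_i * c_i for h_i, c_i in zip(h, context)) for h in hidden_states]
    # Normalize the scores so the weights are positive and sum to 1.
    weights = softmax(scores)
    # The sentence vector is the attention-weighted sum of the word states.
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, pooled


# Toy example: three 2-dimensional word states (made-up values).
word_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context = [1.0, 1.0]
weights, sentence_vec = soft_attention_pool(word_states, context)
```

In this toy input the third word state aligns best with the context vector, so it receives the largest weight. A hierarchical model would apply a second attention step over the resulting sentence vectors to obtain a document vector.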
Pages: 1639-1647
Number of pages: 9