Self-Attention Enhanced Recurrent Neural Networks for Sentence Classification

Cited: 0
Authors
Kumar, Ankit [1]
Rastogi, Reshma [2]
Affiliations
[1] South Asian Univ, New Delhi, India
[2] South Asian Univ, Dept Comp Sci, New Delhi, India
Source
2018 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI) | 2018
Keywords
Sentence Classification; Recurrent Neural Network; Bidirectional Recurrent Neural Network; Long Short-Term Memory; Gated Recurrent Unit; Self-Attention;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
In this paper, we propose self-attention enhanced Recurrent Neural Networks for the task of sentence classification. The proposed framework is based on the Vanilla Recurrent Neural Network and Bi-directional Recurrent Neural Network architectures. These architectures have been implemented over two different recurrent cells, namely Long Short-Term Memory and Gated Recurrent Unit. We use the multi-head self-attention mechanism to improve feature selection and thus preserve dependencies over longer sequences in the recurrent neural network architectures. Further, to ensure better context development, we use Mikolov's pre-trained word2vec word vectors in both the static and non-static modes. To check the efficacy of our proposed framework, we compare our models with the state-of-the-art methods of Yoon Kim on seven benchmark datasets. The proposed framework achieves a state-of-the-art result on four of the seven datasets and a performance gain over the baseline model on five of the seven datasets. Furthermore, to check the effectiveness of self-attention on the task of sentence classification, we compare our self-attention based framework with the Bahdanau attention based implementation from our previous work.
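The record does not include the paper's implementation, but the core mechanism the abstract describes — multi-head scaled dot-product self-attention applied over recurrent hidden states, followed by pooling into a sentence feature vector — can be sketched as follows. This is an illustrative NumPy sketch, not the authors' code; all dimensions, weight names, and the mean-pooling step are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(H, Wq, Wk, Wv, num_heads):
    """Multi-head scaled dot-product self-attention over
    recurrent hidden states H of shape (seq_len, d_model)."""
    seq_len, d_model = H.shape
    d_head = d_model // num_heads
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    # split projections into heads: (num_heads, seq_len, d_head)
    def split(X):
        return X.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    Qh, Kh, Vh = split(Q), split(K), split(V)
    scores = Qh @ Kh.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, seq, seq)
    attn = softmax(scores, axis=-1)                        # rows sum to 1
    out = attn @ Vh                                        # (heads, seq, d_head)
    # concatenate heads back to (seq_len, d_model)
    return out.transpose(1, 0, 2).reshape(seq_len, d_model)

# toy example: 5 time steps of 8-dim hidden states (e.g. BiLSTM outputs)
rng = np.random.default_rng(0)
seq_len, d_model, heads = 5, 8, 2
H = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
Z = multi_head_self_attention(H, Wq, Wk, Wv, heads)
sentence_vec = Z.mean(axis=0)  # pooled feature fed to a classifier layer
```

In a full model, `sentence_vec` would be passed through a dense softmax layer for classification; the attention step lets every time step attend to every other one, which is how the framework preserves long-range dependencies beyond what the recurrent cell alone retains.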
Pages: 905-911
Page count: 7