Self-Attention-Based BiLSTM Model for Short Text Fine-Grained Sentiment Classification

Cited by: 57
Authors
Xie, Jun [1 ]
Chen, Bo [1 ]
Gu, Xinglong [1 ]
Liang, Fengmei [1 ]
Xu, Xinying [2 ]
Affiliations
[1] Taiyuan Univ Technol, Coll Informat & Comp, Jinzhong 030600, Peoples R China
[2] Taiyuan Univ Technol, Coll Elect & Power Engn, Taiyuan 030024, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Aspect-term; bidirectional LSTM (BiLSTM); fine-grained; self-attention;
DOI
10.1109/ACCESS.2019.2957510
Chinese Library Classification (CLC)
TP [Automation & Computer Technology];
Discipline Code
0812;
Abstract
Fine-grained sentiment polarity classification for short texts has been an important and challenging task in natural language processing in recent years. A short text may contain multiple aspect-terms, with opinion terms expressing different sentiments for different aspect-terms, and the polarity of the whole sentence is highly correlated with both. Two challenges remain: how to effectively use contextual information and semantic features, and how to model the correlations between aspect-terms and context words, including opinion terms. To address these problems, a Self-Attention-Based BiLSTM model with aspect-term information is proposed for fine-grained sentiment polarity classification of short texts. The proposed model effectively uses contextual information and semantic features, and in particular models the correlations between aspect-terms and context words. The model consists of a word-encode layer, a BiLSTM layer, a self-attention layer, and a softmax layer. The BiLSTM layer aggregates information from the two opposite directions of a sentence through two independent LSTMs. The self-attention layer captures the parts of a sentence that matter most for a given input aspect-term. Between the BiLSTM layer and the self-attention layer, the hidden vector and the aspect-term vector are fused by addition, which reduces the computational complexity that direct vector concatenation would incur. Experiments are conducted on the public Restaurant and Laptop corpora from SemEval 2014 Task 4 and the Twitter corpus from ACL 2014, with the Friedman and Nemenyi tests used in the comparison study. Compared with existing methods, the experimental results demonstrate that the proposed model is feasible and efficient.
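The fusion-by-addition step the abstract describes can be illustrated with a minimal numpy sketch. This is not the authors' implementation; the dimensions, the random inputs, and the single attention weight vector `w` are all hypothetical, and a real model would learn these parameters and use an actual BiLSTM encoder. The point is only the shape argument: adding the aspect-term vector to each hidden state keeps the dimension at d, whereas concatenation would double it to 2d before the attention scoring.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Toy dimensions (hypothetical): a sentence of T words, hidden size d.
T, d = 5, 4
rng = np.random.default_rng(0)
H = rng.normal(size=(T, d))   # stand-in for BiLSTM hidden states, one per word
a = rng.normal(size=(d,))     # stand-in for the aspect-term vector, same size d

# Fuse by element-wise addition: result stays (T, d), whereas
# concatenating [H; a] per position would give (T, 2*d).
M = H + a

# Self-attention over the fused states: score each position,
# normalize with softmax, and pool into one sentence vector.
w = rng.normal(size=(d,))     # hypothetical attention parameter
scores = M @ w                # shape (T,), one score per word
alpha = softmax(scores)       # attention weights over positions, sum to 1
r = alpha @ M                 # attended sentence representation, shape (d,)

print(M.shape, r.shape)
```

A softmax classifier over `r` would then produce the sentiment polarity; with concatenation instead of addition, every downstream weight matrix would need to handle 2d-dimensional inputs, which is the extra cost the additive fusion avoids.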
Pages: 180558-180570
Page count: 13