A CNN-BiLSTM Model for Document-Level Sentiment Analysis

Cited by: 153
Authors
Rhanoui, Maryem [1, 2]
Mikram, Mounia [2, 3]
Yousfi, Siham [2, 4]
Barzali, Soukaina [2]
Affiliations
[1] Mohammed V Univ Rabat, IMS Team, ADMIR Lab, Rabat IT Ctr, ENSIAS, Rabat 10100, Morocco
[2] LYRICA Lab, Sch Informat Sci, Meridian Team, Rabat 10100, Morocco
[3] Mohammed V Univ, Fac Sci, LRIT Lab, Rabat IT Ctr, Associated Unit CNRST URAC 29, Rabat 10100, Morocco
[4] Mohammed V Univ Rabat, EMI, Rabat IT Ctr, SIP Res Team, Rabat 10100, Morocco
Keywords
sentiment analysis; document level; Doc2vec; CNN-BiLSTM;
DOI
10.3390/make1030048
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Document-level sentiment analysis is a challenging task because long texts contain many words and opinions, at times contradictory, within the same document. This analysis is particularly useful for press articles and blog posts about a particular product or company, and it demands close attention, especially when the topic under discussion is sensitive. Nevertheless, most existing models and techniques are designed to process short texts from social networks and collaborative platforms. In this paper, we propose a combination of Convolutional Neural Network (CNN) and Bidirectional Long Short-Term Memory (BiLSTM) models, with Doc2vec embedding, suitable for opinion analysis in long texts. The CNN-BiLSTM model is compared with CNN, LSTM, BiLSTM, and CNN-LSTM models using Word2vec/Doc2vec embeddings. The CNN-BiLSTM model with Doc2vec was applied to French newspaper articles and outperformed the other models, reaching 90.66% accuracy.
Pages: 832-847
Page count: 16