Empirical study on imbalanced learning of Arabic sentiment polarity with neural word embedding

Cited by: 7
Authors
El-Alfy, El-Sayed M. [1]
Al-Azani, Sadam [1]
Affiliations
[1] King Fahd Univ Petr & Minerals, Informat & Comp Sci Dept, Dhahran, Saudi Arabia
Keywords
Social network; sentiment analysis; polarity detection; word embedding; machine learning; imbalanced dataset; Arabic tweets; classification; SMOTE
DOI
10.3233/JIFS-179703
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the proliferation of social media and mobile technology, a huge amount of unstructured data is posted online daily. Consequently, sentiment analysis has gained increasing importance as a tool to understand the opinions of certain groups of people on contemporary political, cultural, social, or commercial issues. Unlike for Western languages, research on sentiment analysis for dialectal Arabic is still in its early stages, with several challenges to be addressed. The main goal of this study is twofold. First, it compares the performance of core machine learning algorithms for detecting polarity in imbalanced Arabic tweet datasets, using neural word embedding as a feature extractor rather than hand-crafted or traditional features. Second, it examines the impact of various oversampling techniques for handling the highly imbalanced nature of the sentiment data. An intensive empirical analysis of nine machine learning methods and six oversampling methods has been conducted, and the results are discussed in terms of a wide range of performance measures.
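The two ingredients named in the abstract, embedding-based features and oversampling of the minority class, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy two-dimensional vectors, the helper names `mean_embedding` and `smote_like`, and the parameter choices are all assumptions made for illustration, and the interpolation step only approximates the idea behind SMOTE (listed in the keywords).

```python
import random

def mean_embedding(tokens, vectors, dim):
    """Average the vectors of known tokens; return a zero vector if none are known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (the SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical 2-d "embeddings"; real word vectors have far more dimensions.
toy_vectors = {
    "good": [1.0, 0.0], "great": [0.8, 0.2],
    "bad": [-1.0, 0.1], "awful": [-0.9, -0.2], "poor": [-0.7, 0.3],
}

# Majority class (positive) and minority class (negative) as averaged vectors.
positive = [mean_embedding(t, toy_vectors, 2)
            for t in (["good"], ["great"], ["good", "great"], ["great", "good", "good"])]
negative = [mean_embedding(t, toy_vectors, 2)
            for t in (["bad"], ["awful"], ["bad", "poor"])]

# Oversample the minority class until both classes have the same size.
synthetic = smote_like(negative, len(positive) - len(negative))
balanced_negative = negative + synthetic
```

In practice, oversampling of this kind is applied to the training split only, so that synthetic points do not leak into the evaluation data.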
Pages: 6211-6222
Page count: 12