A Large-Scale Sentiment Data Classification for Online Reviews Under Apache Spark

Cited by: 13
Authors
Al-Saqqa, Samar [1, 2]
Al-Naymat, Ghazi [1]
Awajan, Arafat [1]
Affiliations
[1] Princess Sumaya Univ Technol, Amman, Jordan
[2] Univ Jordan, Amman, Jordan
Source
9TH INTERNATIONAL CONFERENCE ON EMERGING UBIQUITOUS SYSTEMS AND PERVASIVE NETWORKS (EUSPN-2018) / 8TH INTERNATIONAL CONFERENCE ON CURRENT AND FUTURE TRENDS OF INFORMATION AND COMMUNICATION TECHNOLOGIES IN HEALTHCARE (ICTH-2018) | 2018, Vol. 141
Keywords
Big Data; Apache Spark; Sentiment; MLlib; Machine Learning; Algorithms
DOI
10.1016/j.procs.2018.10.166
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Sentiment analysis of large-scale data has become increasingly important and has attracted many researchers, prompting them to adopt new platforms and tools that can handle large volumes of data. In this paper, we present new evaluation experiments of sentiment analysis on a large-scale dataset of online customer reviews under the Apache Spark data processing system. Apache Spark's scalable machine learning library (MLlib) is used, and three classification techniques from the library are applied: Naive Bayes, support vector machine, and logistic regression. The results are evaluated using the accuracy metric. Experimental results show that the support vector machine classifier outperforms the Naive Bayes and logistic regression classifiers. © 2018 The Authors. Published by Elsevier Ltd.
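The abstract states that the classifiers are compared using the accuracy metric. As a rough illustration only, the sketch below computes accuracy (correct predictions divided by total predictions) for three sets of predictions in plain Python. The labels, the predictions, and the `accuracy` helper are invented for this example; they are not the paper's data, code, or results, and the sketch does not use Spark or MLlib.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label and prediction sequences must have equal length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty sequence")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Toy labels (1 = positive review, 0 = negative review); invented for illustration.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]

# Hypothetical predictions from three classifiers; NOT the paper's results.
predictions = {
    "naive_bayes": [1, 0, 0, 1, 0, 1, 1, 0],
    "svm": [1, 0, 1, 1, 0, 1, 1, 0],
    "logistic_regression": [1, 1, 0, 1, 0, 0, 1, 0],
}

# Score every classifier against the same held-out labels.
scores = {name: accuracy(y_true, y_pred) for name, y_pred in predictions.items()}

# Print classifiers from highest to lowest accuracy.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Accuracy is a reasonable single-number summary when the positive and negative classes are roughly balanced; on a skewed review dataset, per-class measures such as precision and recall would usually be reported alongside it.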
Pages: 183-189
Page count: 7
Related papers
23 in total
[1]  
Adib P, 2017, PROCEEDINGS OF THE 2017 7TH INTERNATIONAL CONFERENCE ON COMPUTER AND KNOWLEDGE ENGINEERING (ICCKE), P283, DOI 10.1109/ICCKE.2017.8167892
[2]  
[Anonymous], 2010, HOTCLOUD
[3]  
[Anonymous], 2016, The Journal of Machine Learning Research, DOI 10.1145/2882903.2912565
[4]  
[Anonymous], 2012, International Journal of Computer Science Issues
[5]  
[Anonymous], WIMS
[6]  
Baltas A., 2016, International Workshop of Algorithmic Aspects of Cloud Computing, P15
[7]  
Banic L, 2013, 2013 36TH INTERNATIONAL CONVENTION ON INFORMATION AND COMMUNICATION TECHNOLOGY, ELECTRONICS AND MICROELECTRONICS (MIPRO), P1149
[8]  
Bhatt A., 2015, INT J COMPUTER SCI I, V6, P5107
[9]   Big Data: A Survey [J].
Chen, Min ;
Mao, Shiwen ;
Liu, Yunhao .
MOBILE NETWORKS & APPLICATIONS, 2014, 19 (02) :171-209
[10]   Arabic Sentiment Analysis using Supervised Classification [J].
Duwairi, Rehab M. ;
Qarqaz, Islam .
2014 INTERNATIONAL CONFERENCE ON FUTURE INTERNET OF THINGS AND CLOUD (FICLOUD), 2014, :579-583