Towards an efficient opinion measurement in Arabic comments

Cited by: 18
Authors
Cherif, Walid [1]
Madani, Abdellah [2]
Kissi, Mohamed [1]
Affiliations
[1] Fac Sci, Dept Comp Sci, Lab LIMA, BP 20, El Jadida 24000, Morocco
[2] Fac Sci, Dept Comp Sci, Lab LAROSERI, El Jadida 24000, Morocco
Source
INTERNATIONAL CONFERENCE ON ADVANCED WIRELESS INFORMATION AND COMMUNICATION TECHNOLOGIES (AWICT 2015) | 2015 / Vol. 73
Keywords
Automatic language processing; Arabic text; Opinion mining; Low-level light-stemming; Support vector machines; K-nearest neighbors; Similarity measure
DOI
10.1016/j.procs.2015.12.057
Chinese Library Classification number
TN [Electronic technology, communication technology]
Discipline classification code
0809
Abstract
Arabic is the fifth most widely used language on the Internet(1). Every day, a huge volume of Arabic comments and reviews is generated concerning different aspects of our lives. In light of the scarcity of systems to analyze this data, we propose in this paper a complete approach to identify and classify authors' opinions. The study is conducted on a dataset of 625 Arabic reviews and comments collected from the TripAdvisor website, which fall into five classes. We first chose an appropriate stemming algorithm and used it to introduce our new mathematical approach to formulating opinions. The classification, which is based on Support Vector Machines, derived a scheme which, in turn, often needed to be refined because some reviews remained unclassified. We then opted for a new similarity approach based on k-nearest neighbors and feature weighting to classify them. Finally, we compared our global approach with two recent works in terms of accuracy. The results obtained met our expectations. (C) 2015 The Authors. Published by Elsevier B.V.
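As a rough illustration of the second stage described in the abstract (a k-nearest-neighbors similarity step with feature weighting for reviews the first classifier leaves unclassified), the following is a minimal sketch only. The affix lists in `light_stem`, the IDF-style weights, and the weighted cosine similarity are all assumptions chosen for illustration; they are not the paper's low-level light stemmer or its similarity measure, and the SVM stage is not shown.

```python
import math
from collections import Counter

# Hypothetical affix lists for illustration; NOT the paper's stemmer.
PREFIXES = ("ال", "و")
SUFFIXES = ("ات", "ون", "ين", "ة")


def light_stem(token):
    """Strip at most one listed prefix and one listed suffix, keeping >= 3 letters."""
    for p in PREFIXES:
        if token.startswith(p) and len(token) - len(p) >= 3:
            token = token[len(p):]
            break
    for s in SUFFIXES:
        if token.endswith(s) and len(token) - len(s) >= 3:
            token = token[:-len(s)]
            break
    return token


def vectorize(text):
    """Bag of stemmed tokens (whitespace tokenization is an assumption)."""
    return Counter(light_stem(t) for t in text.split())


def idf_weights(vectors):
    """Smoothed IDF weight per term (assumed weighting scheme)."""
    n = len(vectors)
    df = Counter(term for v in vectors for term in v)
    return {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}


def weighted_cosine(a, b, w):
    """Cosine similarity after scaling each term count by its weight."""
    dot = sum(a[t] * b[t] * w.get(t, 1.0) ** 2 for t in a if t in b)
    na = math.sqrt(sum((c * w.get(t, 1.0)) ** 2 for t, c in a.items()))
    nb = math.sqrt(sum((c * w.get(t, 1.0)) ** 2 for t, c in b.items()))
    return dot / (na * nb) if na and nb else 0.0


def knn_classify(text, train, k=3):
    """Similarity-weighted vote among the k most similar labelled reviews."""
    vectors = [vectorize(t) for t, _ in train]
    w = idf_weights(vectors)
    q = vectorize(text)
    scored = sorted(
        ((weighted_cosine(q, v, w), label) for v, (_, label) in zip(vectors, train)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter()
    for sim, label in scored[:k]:
        votes[label] += sim
    return votes.most_common(1)[0][0]


# Toy labelled reviews (invented for this sketch; 5 = very positive, 1 = very negative).
train = [
    ("الفندق ممتاز ونظيف", 5),
    ("خدمة ممتازة", 5),
    ("الفندق سيء", 1),
    ("خدمة سيئة", 1),
]
```

Calling `knn_classify("فندق ممتاز", train)` on the toy data assigns the review to the class of its most similar labelled neighbors; in a two-stage setup it would be called only for reviews the first classifier could not place.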
Pages: 122-129
Number of pages: 8