The Effects of Features Selection Methods on Spam Review Detection Performance

Cited by: 10
Authors
Etaiwi, Wael [1]
Awajan, Arafat [1]
Affiliations
[1] Princess Sumaya Univ Technol, Amman, Jordan
Source
2017 INTERNATIONAL CONFERENCE ON NEW TRENDS IN COMPUTING SCIENCES (ICTCS) | 2017
Keywords
spam reviews; feature selection; machine learning; spam detection
DOI
10.1109/ICTCS.2017.50
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Online reviews have become a valuable source of information that indicates the overall opinion about products and services, and they may affect decision-making processes such as purchasing a product or service. Fake reviews are considered spam reviews, and they may have a great impact on online marketplace behavior. Extracting useful features from review text using Natural Language Processing (NLP) is not a straightforward step, and it affects the overall performance and results. Many types of features can be used for this task, such as Bag-of-Words, linguistic features, word counts, and n-gram features. In this paper, we investigate the effects of using two different feature selection methods on spam review detection: Bag-of-Words and word counts. Different machine learning algorithms were applied, such as Support Vector Machine, Decision Tree, Naive Bayes, and Random Forest. Experiments were conducted on a labeled, balanced dataset of hotel reviews. Efficiency is evaluated according to several measures, such as precision, recall, and accuracy.
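As a rough illustration of the two feature types and the evaluation measures named in the abstract, the following Python sketch builds Bag-of-Words vectors, a simple word-count feature vector, and precision/recall/accuracy from predicted labels. The abstract does not define what "word counts" contains or how the text is tokenized, so the function names, the punctuation-stripping tokenizer, and the choice of total and distinct token counts are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter

PUNCT = ".,!?;:\"'()"

def tokenize(text):
    # Assumed tokenizer: split on whitespace, strip basic punctuation, lowercase.
    tokens = []
    for raw in text.split():
        tok = raw.strip(PUNCT).lower()
        if tok:
            tokens.append(tok)
    return tokens

def bag_of_words(docs):
    # One count vector per document over the sorted vocabulary of all documents.
    vocab = sorted({tok for d in docs for tok in tokenize(d)})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for d in docs:
        vec = [0] * len(vocab)
        for tok, n in Counter(tokenize(d)).items():
            vec[index[tok]] = n
        vectors.append(vec)
    return vocab, vectors

def word_count_features(doc):
    # Hypothetical "word counts" features: total tokens and distinct tokens.
    tokens = tokenize(doc)
    return [len(tokens), len(set(tokens))]

def precision_recall_accuracy(y_true, y_pred, positive=1):
    # Standard definitions, with the spam class taken as the positive label.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return precision, recall, accuracy
```

Either feature representation could then be passed to a classifier such as those listed in the abstract; the sketch stops short of training so that it does not assume any particular library.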
Pages: 116-120
Number of pages: 5
Related papers
50 items in total
  • [21] Graph regularization methods for Web spam detection
    Jacob Abernethy
    Olivier Chapelle
    Carlos Castillo
    Machine Learning, 2010, 81 : 207 - 225
  • [22] Graph regularization methods for Web spam detection
    Abernethy, Jacob
    Chapelle, Olivier
    Castillo, Carlos
    MACHINE LEARNING, 2010, 81 (02) : 207 - 225
  • [23] Spam review detection using LSTM autoencoder: an unsupervised approach
    Saumya, Sunil
    Singh, Jyoti Prakash
    ELECTRONIC COMMERCE RESEARCH, 2022, 22 (01) : 113 - 133
  • [24] Statistical Twitter Spam Detection Demystified: Performance, Stability and Scalability
    Lin, Guanjun
    Sun, Nan
    Nepal, Surya
    Zhang, Jun
    Xiang, Yang
    Hassan, Houcine
    IEEE ACCESS, 2017, 5 : 11142 - 11154
  • [25] Spam review detection using LSTM autoencoder: an unsupervised approach
    Sunil Saumya
    Jyoti Prakash Singh
    Electronic Commerce Research, 2022, 22 : 113 - 133
  • [26] Spam Review Detection Techniques: A Systematic Literature Review
    Hussain, Naveed
    Mirza, Hamid Turab
    Rasool, Ghulam
    Hussain, Ibrar
    Kaleem, Mohammad
    APPLIED SCIENCES-BASEL, 2019, 9 (05):
  • [27] ENWalk: Learning Network Features for Spam Detection in Twitter
    Santosh, K. C.
    Maity, Suman Kalyan
    Mukherjee, Arjun
    SOCIAL, CULTURAL, AND BEHAVIORAL MODELING, 2017, 10354 : 90 - 101
  • [28] EMAIL SPAM DETECTION: A SYMBIOTIC FEATURE SELECTION APPROACH FOSTERED BY EVOLUTIONARY COMPUTATION
    Sousa, Pedro
    Cortez, Paulo
    Vaz, Rui
    Rocha, Miguel
    Rio, Miguel
    INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY & DECISION MAKING, 2013, 12 (04) : 863 - 884
  • [29] Cost-Sensitive Spam Detection Using Parameters Optimization and Feature Selection
    Lee, Sang Min
    Kim, Dong Seong
    Park, Jong Sou
    JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 2011, 17 (06) : 944 - 960
  • [30] An effective feature selection method for web spam detection
    Asdaghi, Faeze
    Soleimani, Ali
    KNOWLEDGE-BASED SYSTEMS, 2019, 166 : 198 - 206