The Effects of Features Selection Methods on Spam Review Detection Performance

Cited by: 10
Authors
Etaiwi, Wael [1]
Awajan, Arafat [1]
Affiliations
[1] Princess Sumaya Univ Technol, Amman, Jordan
Source
2017 INTERNATIONAL CONFERENCE ON NEW TRENDS IN COMPUTING SCIENCES (ICTCS) | 2017
Keywords
spam reviews; feature selection; machine learning; spam detection
DOI
10.1109/ICTCS.2017.50
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Online reviews have become a valuable source of information about the overall opinion of products and services, and they can affect decision-making processes such as purchasing a product or service. Fake reviews are considered spam reviews, and they can have a great impact on online marketplace behavior. Extracting useful features from a review's text using Natural Language Processing (NLP) is not a straightforward step, and the choice of features affects overall detection performance and results. Many types of features can be used for this task, such as Bag-of-Words, linguistic features, word counts, and n-gram features. In this paper, we investigate the effects of two different feature selection methods on spam review detection: Bag-of-Words and word counts. Several machine learning algorithms were applied, including Support Vector Machine, Decision Tree, Naive Bayes, and Random Forest. Experiments were conducted on a labeled, balanced dataset of hotel reviews. Efficiency is evaluated using several measures, including precision, recall, and accuracy.
Pages: 116-120
Page count: 5
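
The abstract names two feature types (Bag-of-Words and word counts) and three evaluation measures (precision, recall, accuracy). The following is a minimal sketch of what these could look like, not the authors' implementation. The tokenizer, the function names, and the reading of "word counts" as the number of tokens per review are all assumptions made for illustration; the record does not specify them.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase, keep runs of letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(reviews):
    """Return (vocabulary, rows): one term-frequency vector per review."""
    tokenized = [tokenize(r) for r in reviews]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: i for i, tok in enumerate(vocab)}
    rows = []
    for toks in tokenized:
        row = [0] * len(vocab)
        for tok, n in Counter(toks).items():
            row[index[tok]] = n
        rows.append(row)
    return vocab, rows


def word_count(reviews):
    """Assumed reading of 'word counts': number of tokens in each review."""
    return [len(tokenize(r)) for r in reviews]


def precision_recall_accuracy(y_true, y_pred, positive=1):
    """Standard definitions, treating `positive` as the spam label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return precision, recall, accuracy


# Hypothetical toy reviews, not taken from the paper's dataset.
reviews = ["Great hotel, great staff", "Best hotel ever"]
vocab, rows = bag_of_words(reviews)
counts = word_count(reviews)
```

Either feature representation could then be passed to a classifier such as those listed in the abstract; the classifier step is omitted here because the record gives no configuration details.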