Fake reviews classification using deep learning ensemble of shallow convolutions

Cited by: 17
Authors
Javed, Muhammad Saad [1]
Majeed, Hammad [1]
Mujtaba, Hasan [1]
Beg, Mirza Omer [1]
Affiliations
[1] Natl Univ Comp & Emerging Sci, AK Brohi Rd,Sect H-11-4, Islamabad, Pakistan
Source
JOURNAL OF COMPUTATIONAL SOCIAL SCIENCE | 2021 / Vol. 4 / Issue 02
Keywords
Deep learning in NLP; Fake reviews detection; Convolutional networks; Information retrieval; Ensemble models; Social behavior;
DOI
10.1007/s42001-021-00114-y
Chinese Library Classification (CLC) number
O1 [Mathematics]; C [Social Sciences, General];
Subject classification code
03 ; 0303 ; 0701 ; 070101 ;
Abstract
Online reviews have a decisive impact on consumers' purchasing decisions. This opens the door for spammers and scammers to post fake reviews that promote non-existent products or undermine competitors' products in order to affect social behavior. Thus, identifying reviews as fake or real has become ever more important. Traditional approaches to text classification either use a bag-of-words model to represent text, which causes sparsity, or rely on word representations learnt from neural networks, which have limited ability to handle unknown words. In this paper, we propose a technique based on three different models trained on the idea of multi-view learning, and we create an ensemble of all models by employing an aggregation technique to generate the final predictions. The core idea of our methodology is to extract rich information from the text of reviews by combining bag-of-n-grams and parallel convolutional neural networks (CNNs). By using an n-gram embedding layer with small kernel sizes, we can use local context with the same computation power as required to train deep and complex CNNs. Our CNN-based architecture consumes n-gram embeddings as input and uses parallel convolutional blocks to extract richer feature representations from text. Our approach to the detection of fake reviews also combines textual linguistic features with non-textual features related to reviewer behavior. We evaluate our approach on the publicly available Yelp Filtered Dataset and achieve F1 scores of up to 92% for classifying fake reviews.
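The two ideas the abstract names, a bag-of-n-grams representation and an ensemble that aggregates the predictions of three models, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state which aggregation technique is used, so averaging the per-model probabilities (soft voting) is an assumption here, and the function names `bag_of_ngrams` and `soft_vote`, the 0.5 threshold, and the example probabilities are all hypothetical.

```python
from collections import Counter
from typing import Sequence


def bag_of_ngrams(tokens: Sequence[str], max_n: int = 2) -> Counter:
    """Count every contiguous word n-gram for n = 1..max_n."""
    counts: Counter = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def soft_vote(fake_probs: Sequence[float], threshold: float = 0.5) -> str:
    """Average the per-model P(fake) values and threshold the mean.

    Assumption: simple averaging stands in for the unspecified
    aggregation technique mentioned in the abstract.
    """
    if not fake_probs:
        raise ValueError("need at least one model prediction")
    mean = sum(fake_probs) / len(fake_probs)
    return "fake" if mean >= threshold else "real"


# Unigram and bigram counts for a short, made-up review.
tokens = "best pizza in town".split()
features = bag_of_ngrams(tokens, max_n=2)

# Hypothetical P(fake) outputs from three separately trained models.
label = soft_vote([0.9, 0.6, 0.3])
```

In a full pipeline the n-gram counts would be replaced by learned n-gram embeddings fed to the convolutional blocks; the sketch only shows the representation and the final aggregation step.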
Pages: 883 - 902
Number of pages: 20
Related papers
34 in total
  • [1] Anwar T, 2020, P 14 WORKSH SEM EV, P2177
  • [2] DeepDetect: Detection of Distributed Denial of Service Attacks Using Deep Learning
    Asad, Muhammad
    Asim, Muhammad
    Javed, Talha
    Beg, Mirza O.
    Mujtaba, Hasan
    Abbas, Sohail
    [J]. COMPUTER JOURNAL, 2020, 63 (07) : 983 - 994
  • [3] TOP-Rank: A TopicalPostionRank for Extraction and Classification of Keyphrases in Text
    Awan, Mubashar Nazar
    Beg, Mirza Omer
    [J]. COMPUTER SPEECH AND LANGUAGE, 2021, 65 (65)
  • [4] A neural probabilistic language model
    Bengio, Y
    Ducharme, R
    Vincent, P
    Jauvin, C
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2003, 3 (06) : 1137 - 1155
  • [5] Towards automatic filtering of fake reviews
    Cardoso, Emerson F.
    Silva, Renato M.
    Almeida, Tiago A.
    [J]. NEUROCOMPUTING, 2018, 309 : 106 - 116
  • [6] Semi-supervised Learning based Fake Review Detection
    Deng, Huaxun
    Zhao, Linfeng
    Luo, Ning
    Liu, Yuan
    Guo, Guibing
    Wang, Xingwei
    Tan, Zhenhua
    Wang, Shuang
    Zhou, Fucai
    [J]. 2017 15TH IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING WITH APPLICATIONS AND 2017 16TH IEEE INTERNATIONAL CONFERENCE ON UBIQUITOUS COMPUTING AND COMMUNICATIONS (ISPA/IUCC 2017), 2017, : 1278 - 1280
  • [7] Understanding Citizen Issues through Reviews: A Step towards Data Informed Planning in Smart Cities
    Dilawar, Noman
    Majeed, Hammad
    Beg, Mirza Omer
    Ejaz, Naveed
    Muhammad, Khan
    Mehmood, Irfan
    Nam, Yunyoung
    [J]. APPLIED SCIENCES-BASEL, 2018, 8 (09):
  • [8] Hovy D, 2016, PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2016), VOL 2, P351
  • [9] Javed AR, 2020, J AMB INTEL HUM COMP, DOI [10.1007/s10723-019-09498-8, 10.1007/s12652-020-01770-0]
  • [10] Jia SH, 2018, 2018 4TH INTERNATIONAL CONFERENCE ON INFORMATION MANAGEMENT (ICIM2018), P280, DOI 10.1109/INFOMAN.2018.8392850