GSLDA: LDA-based group spamming detection in product reviews

Cited by: 51
Authors
Wang, Zhuo [1]
Gu, Songmin [1]
Xu, Xiaowei [2]
Affiliations
[1] Shenyang Ligong Univ, Shenyang, Liaoning, Peoples R China
[2] Univ Arkansas, Little Rock, AR 72204 USA
Keywords
Review spam; Group spamming; LDA; Opinion spamming
DOI
10.1007/s10489-018-1142-1
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Online product reviews are becoming increasingly important because they guide people's purchase decisions. Because they are highly subjective, online reviews are vulnerable to opinion spamming, i.e., fraudsters writing fake reviews or giving unfair ratings to promote or demote target products. Although there has been much effort in this field, the problem remains open because ground-truth data are difficult to gather. As more and more people use the Internet in everyday life, group review spamming, in which a group of fraudsters writes hype reviews (to promote) or defaming reviews (to demote) for one or more target products, has become the main form of review spamming. In this paper, we propose an LDA-based computing framework, namely GSLDA, for group spamming detection in product review data. As a completely unsupervised approach, GSLDA works in two phases. It first adapts LDA (Latent Dirichlet Allocation) to the product review context in order to confine closely related group spammers to a small reviewer cluster, and then it extracts highly suspicious reviewer groups from each LDA cluster. Experiments on three real-world datasets show that GSLDA can detect high-quality spammer groups, outperforming many state-of-the-art baselines in terms of accuracy.
Pages: 3094-3107
Page count: 14