Detecting Anomalous Online Reviewers: An Unsupervised Approach Using Mixture Models

Cited by: 61
Authors
Kumar, Naveen [1]
Venugopal, Deepak [2]
Qiu, Liangfei [3]
Kumar, Subodha [4, 5]
Affiliations
[1] Univ Washington, Management Informat Syst, Sch Business, Bothell, WA USA
[2] Univ Memphis, Dept Comp Sci, Memphis, TN 38152 USA
[3] Univ Florida, Dept Informat Syst & Operat Management, Warrington Coll Business, Gainesville, FL 32611 USA
[4] Temple Univ, Supply Chain Management Mkt Informat Syst & Stat, Philadelphia, PA 19122 USA
[5] Temple Univ, Ctr Data Analyt, Fox Sch Business, Philadelphia, PA 19122 USA
Keywords
online reviews; fake reviews; opinion spam; unsupervised learning; anomaly detection; mixture models; deception detection; WORD-OF-MOUTH; DECEPTION; MANIPULATION; PLATFORMS; FAKE; STRATEGIES; SUPPORT; IMPACT; SALES; CHAT;
DOI
10.1080/07421222.2019.1661089
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Online reviews play a significant role in influencing decisions made by users in day-to-day life. The presence of reviewers who deliberately post fake reviews for financial or other gains, however, negatively impacts both users and businesses. Unfortunately, automatically detecting such reviewers is a challenging problem since fake reviews do not seem out-of-place next to genuine reviews. In this paper, we present a fully unsupervised approach to detect anomalous behavior in online reviewers. We propose a novel hierarchical approach for this task in which we (1) derive distributions for key features that define reviewer behavior, and (2) combine these distributions into a finite mixture model. Our approach is highly generalizable and it allows us to seamlessly combine both univariate and multivariate distributions into a unified anomaly detection system. Most importantly, it requires no explicit labeling (spam/not spam) of the data. Our newly developed approach outperforms prior state-of-the-art unsupervised anomaly detection approaches.
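The abstract's general idea (fit a finite mixture to a reviewer-behavior feature, then flag reviewers the mixture assigns to a rare component) can be illustrated with a minimal sketch. This is not the authors' hierarchical model, which combines several univariate and multivariate feature distributions; it is a plain two-component Gaussian mixture fitted by EM on a single feature. The feature name `reviews_per_day` and the toy values are assumptions made up purely for illustration.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fit_two_component_mixture(values, n_iter=100, min_std=1e-3):
    """Fit a 1-D two-component Gaussian mixture with plain EM.

    Deterministic initialization: the two means start at the data
    minimum and maximum, both standard deviations at the overall spread.
    """
    n = len(values)
    overall_mean = sum(values) / n
    spread = math.sqrt(sum((v - overall_mean) ** 2 for v in values) / n) or 1.0
    weights = [0.5, 0.5]
    means = [min(values), max(values)]
    stds = [spread, spread]

    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for v in values:
            dens = [weights[k] * normal_pdf(v, means[k], stds[k]) for k in range(2)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])

        # M-step: re-estimate weight, mean, and std of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # component is empty; keep its previous parameters
            weights[k] = nk / n
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            stds[k] = max(math.sqrt(var), min_std)

    return weights, means, stds


def minority_posterior(value, weights, means, stds):
    """Posterior probability that value belongs to the lower-weight component."""
    dens = [weights[k] * normal_pdf(value, means[k], stds[k]) for k in range(2)]
    total = sum(dens) or 1e-300
    minority = 0 if weights[0] < weights[1] else 1
    return dens[minority] / total


# Hypothetical feature: average reviews posted per day, one value per reviewer.
reviews_per_day = [0.8, 1.0, 1.2, 0.9, 1.1, 1.3, 0.7, 1.0, 1.1, 0.9, 9.5, 10.2]

weights, means, stds = fit_two_component_mixture(reviews_per_day)
scores = [minority_posterior(v, weights, means, stds) for v in reviews_per_day]
```

No labels are used anywhere in the fit, which matches the unsupervised setting the abstract describes. Treating the lower-weight component as "anomalous" is a design choice of this sketch, not a claim about how the paper scores reviewers.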
Pages: 1313-1346
Number of pages: 34