Review Spam Detection using Active Learning

Cited by: 0
Authors
Ahsan, M. N. Istiaq [1]
Nahian, Tamzid [1]
Kafi, Abdullah All [1]
Hossain, Ismail [1]
Shah, Faisal Muhammad [1]
Affiliations
[1] Ahsanullah Univ Sci & Technol, Dept Comp Sci & Engn, Dhaka, Bangladesh
Source
7TH IEEE ANNUAL INFORMATION TECHNOLOGY, ELECTRONICS & MOBILE COMMUNICATION CONFERENCE IEEE IEMCON-2016 | 2016
Keywords
Review spam detection; Opinion Mining; Fake Review; Spam Review; Spam Detection;
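DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Subject classification code
0808 ; 0809 ;
Abstract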
As Internet access has become much easier over the last decade or so, people are using online applications more than ever. Online marketing, and e-commerce as a whole, is growing day by day, if not minute by minute. Online reviews play a very important role in this field and are proving valuable for decision making from the customer's point of view. Although this information is sensitive and significant, the authenticity of user-generated content (reviews, forums, blogs, discussion groups, etc.) is only erratically ensured. As a result, spamming, fake reviews, and fabricated opinions are on the rise. Indeed, review spam has become a profitable business that obscures the genuine facts. Several techniques have been proposed for this problem, but they depend mostly on empirical conditions, rating consistency, obvious content features, helpfulness votes, and the like, which limits their effectiveness. Most existing studies use supervised models, whereas good-quality large-scale datasets are still very scarce, and most models use pseudo fake reviews instead of real fake reviews. In this research, we introduce an active learning approach to detect review spam using the TF-IDF features of the review content. Our model achieves substantial improvements in performance measures on almost 3600 reviews from different domains. In the best case, it achieves up to 88% accuracy, and precision, recall, and F-scores are above 85% in most cases. Additionally, about 2000 reviews were manually labeled during the process. Finally, the evaluation results indicate that this is a promising methodology for detecting review spam.
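The abstract names the ingredients (TF-IDF features, active learning, manual labeling) but not the classifier or the query strategy. The following is a minimal sketch of how such a loop could look, not the authors' implementation. It assumes uncertainty sampling as the query strategy and swaps in a simple nearest-centroid classifier. All function names, the `oracle` callback (standing in for the human annotator), and the toy reviews are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one L2-normalised sparse TF-IDF vector (a dict) per document."""
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        vec = {
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({term: w / norm for term, w in vec.items()})
    return vectors


def dot(a, b):
    """Dot product of two sparse vectors stored as dicts."""
    return sum(w * b.get(term, 0.0) for term, w in a.items())


def fit_centroids(vectors, labels):
    """Average the TF-IDF vectors of each labeled class (assumed classifier)."""
    sums = {"spam": Counter(), "genuine": Counter()}
    sizes = Counter(labels.values())
    for idx, label in labels.items():
        for term, w in vectors[idx].items():
            sums[label][term] += w
    return {
        label: {term: w / sizes[label] for term, w in total.items()}
        for label, total in sums.items()
    }


def spam_score(vec, centroids):
    """Positive leans 'spam', negative leans 'genuine', near zero is uncertain."""
    return dot(vec, centroids["spam"]) - dot(vec, centroids["genuine"])


def active_learning(docs, seed_labels, oracle, n_queries):
    """Repeatedly ask the oracle to label the most uncertain unlabeled review.

    seed_labels must contain at least one example of each class.
    """
    vectors = tfidf_vectors(docs)
    labels = dict(seed_labels)
    for _ in range(n_queries):
        unlabeled = [i for i in range(len(docs)) if i not in labels]
        if not unlabeled:
            break
        centroids = fit_centroids(vectors, labels)
        query = min(unlabeled, key=lambda i: abs(spam_score(vectors[i], centroids)))
        labels[query] = oracle(query)  # manual labeling step
    centroids = fit_centroids(vectors, labels)
    predictions = [
        "spam" if spam_score(vec, centroids) > 0 else "genuine" for vec in vectors
    ]
    return labels, predictions


# Toy pool of reviews; `truth` plays the role of the human annotator.
docs = [
    "best product ever buy now amazing deal",
    "the battery lasted two days and the screen is sharp",
    "amazing deal buy now best price ever",
    "screen is sharp but the battery drains fast",
    "buy now amazing product best deal",
    "the case fits well and the battery is fine",
]
truth = ["spam", "genuine", "spam", "genuine", "spam", "genuine"]
labels, predictions = active_learning(
    docs, {0: "spam", 1: "genuine"}, oracle=lambda i: truth[i], n_queries=2
)
```

In this sketch each query costs one manual label, which is the trade-off the abstract points at: labels are requested only for the reviews the current model is least sure about, rather than for the whole pool.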
Pages: 7