Detection model of effectiveness of Chinese online reviews based on logistic regression

Cited by: 3
Authors
Wu, Hanqian [1]
Zhu, Yunjie [1]
Xie, Jue [2]
Affiliations
[1] School of Computer Science and Engineering, Southeast University, Nanjing
[2] Southeast University-Monash University Joint Graduate School, Suzhou
Source
Dongnan Daxue Xuebao (Ziran Kexue Ban)/Journal of Southeast University (Natural Science Edition) | 2015 / Vol. 45 / No. 03
Keywords
Association rule; Effectiveness of online review; Logistic regression
DOI
10.3969/j.issn.1001-0505.2015.03.004
Abstract
To automate detection of the effectiveness of Chinese online reviews in e-commerce and social network settings, a spam detection model based on logistic regression is proposed for the single-topic classification problem. Detecting the effectiveness of Chinese online reviews is treated as a classification problem. Based on the characteristics of Chinese online reviews, nine features are extracted to build the classification model. To extract the core feature, topic relevance, an association-rule-based review term mode is used to optimize topic identification in ICTCLAS (Institute of Computing Technology, Chinese Lexical Analysis System). A cross-language model is then used to measure the relevance between online review topics. In the experiment, a sample of 1,000 human-labeled reviews is used, and a support vector machine (SVM) classification model serves as the comparison. Results computed with the data mining tool Weka show that the proposed logistic regression classification model, based on the optimized review term classification mode, achieves an accuracy of 83.54%, which is 2.10% higher than that of the SVM classification model. © 2015, Southeast University. All rights reserved.
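The classification step described in the abstract can be illustrated with a minimal logistic regression sketch. This is not the authors' implementation (the abstract reports results from Weka on nine extracted features); it is a plain-Python illustration trained by batch gradient descent on synthetic data. The two standardized features, the toy data generator, and all function names are assumptions made for this example only.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(w, b, x):
    # Linear score w . x + b for one feature vector.
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_logistic(X, y, lr=0.5, epochs=200):
    # Batch gradient descent on the mean log-loss.
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(score(w, b, xi)) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    # 1 = effective review, 0 = spam (labels are hypothetical).
    return 1 if sigmoid(score(w, b, x)) >= 0.5 else 0


def accuracy(w, b, X, y):
    return sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(y)


# Synthetic stand-in for labeled reviews: two made-up standardized
# features per review (e.g. topic relevance and one other signal).
rng = random.Random(0)
X, y = [], []
for _ in range(200):
    label = rng.randint(0, 1)
    center = 1.0 if label else -1.0
    X.append([rng.gauss(center, 0.5), rng.gauss(center, 0.5)])
    y.append(label)

# Hold out the last 50 samples for evaluation.
w, b = train_logistic(X[:150], y[:150])
test_acc = accuracy(w, b, X[150:], y[150:])
print("held-out accuracy:", test_acc)
```

The accuracy printed here reflects only the synthetic data and says nothing about the 83.54% reported in the abstract; a comparison against an SVM, as in the experiment, would require a separate classifier trained on the same split.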
Pages: 433-437
Number of pages: 4
Related papers
14 in total
  • [1] Karkare V.Y., Gupta S.R., A survey on product evaluation using opinion mining, International Journal of Computer Science and Applications, 6, 2, pp. 306-312, (2013)
  • [2] Sheibani A.A., Opinion mining and opinion spam: a literature review focusing on product reviews, 2012 Sixth International Symposium on Telecommunications (IST), pp. 1109-1113, (2012)
  • [3] Lim E.P., Nguyen V.A., Jindal N., et al., Detecting product review spammers using rating behaviors, Proceedings of the 19th ACM International Conference on Information and Knowledge Management, pp. 939-948, (2010)
  • [4] Jindal N., Liu B., Lim E.P., Finding unusual review patterns using unexpected rules, Proceedings of the 19th ACM International Conference on Information and Knowledge Management, pp. 1549-1552, (2010)
  • [5] Mukherjee A., Kumar A., Liu B., et al., Spotting opinion spammers using behavioral footprints, Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 632-640, (2013)
  • [6] Jindal N., Liu B., Opinion spam and analysis, Proceedings of the 2008 International Conference on Web Search and Data Mining, pp. 219-230, (2008)
  • [7] Ott M., Cardie C., Hancock J.T., Negative deceptive opinion spam, North American Chapter of the Association for Computational Linguistics-Human Language Technologies, pp. 497-501, (2013)
  • [8] Lin Y., Zhu T., Wang X., et al., Towards online review spam detection, Proceedings of the Companion Publication of the 23rd International Conference on World Wide Web Companion, pp. 341-342, (2014)
  • [9] Liu B., Sentiment analysis and opinion mining, Synthesis Lectures on Human Language Technologies, 5, 1, pp. 1-167, (2012)
  • [10] Xu L., Lin H., Pan Y., et al., Constructing the affective lexicon ontology, Journal of the China Society for Scientific and Technical Information, 27, 2, pp. 180-185, (2008)