Comparison of feature-level learning methods for mining online consumer reviews

Cited by: 67
Authors
Chen, Li [1]
Qi, Luole [1]
Wang, Feng [1]
Affiliations
[1] Hong Kong Baptist Univ, Dept Comp Sci, Hong Kong, Hong Kong, Peoples R China
Keywords
Consumer reviews; E-commerce; Feature-level opinion mining; Conditional Random Fields (CRFs); Lexicalized Hidden Markov Model (L-HMMs); Association rule mining;
DOI
10.1016/j.eswa.2012.02.158
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
The tasks of feature-level opinion mining usually include extracting product entities from consumer reviews, identifying the opinion words associated with those entities, and determining the polarity of each opinion (e.g., positive, negative, or neutral). In recent years, two major approaches have been proposed to determine opinions at the feature level: model-based methods, such as the one based on the lexicalized Hidden Markov Model (L-HMMs), and statistical methods, such as the technique based on association rule mining. However, little work has compared these algorithms with regard to their practical ability to identify various types of review elements, such as features, opinions, intensifiers, entity phrases and infrequent entities. Moreover, little attention has been paid to applying more discriminative learning models to these opinion mining tasks. In this paper, we not only experimentally compared these methods on a real-world review dataset, but also adopted the Conditional Random Fields (CRFs) model and evaluated its performance against the related algorithms. For the CRFs-based mining algorithm, we additionally tested the role of a self-tagging process under two automatic training conditions, and identified the combination of learning functions that optimizes its learning performance. The comparative experiment revealed that the CRFs-based method is more accurate than the other methods in mining multiple review elements. (C) 2012 Elsevier Ltd. All rights reserved.
Pages: 9588-9601
Number of pages: 14