Care more about customers: Unsupervised domain-independent aspect detection for sentiment analysis of customer reviews

Cited by: 92
Authors
Bagheri, Ayoub [1]
Saraee, Mohamad [2]
de Jong, Franciska [3]
Affiliations
[1] Isfahan Univ Technol, Elect & Comp Engn Dept, Intelligent Database Data Min & Bioinformat Lab, Esfahan, Iran
[2] Univ Salford, Sch Comp Sci & Engn, Manchester, Lancs, England
[3] Erasmus Univ, Univ Twente, Human Media Interact Grp, NL-3000 DR Rotterdam, Netherlands
Keywords
Aspect detection; Opinion mining; Review mining; Sentiment analysis; Implicit aspect; EXTRACTION; FEATURES; IRONY;
DOI
10.1016/j.knosys.2013.08.011
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the rapid growth of user-generated content on the internet, automatic sentiment analysis of online customer reviews has recently become a hot research topic. However, because of the variety and wide range of products and services reviewed online, supervised and domain-specific models are often impractical. As the number of reviews grows, it is essential to develop an efficient sentiment analysis model capable of extracting product aspects and determining the sentiments for these aspects. In this paper, we propose a novel unsupervised and domain-independent model for detecting explicit and implicit aspects in reviews for sentiment analysis. First, a generalized method is proposed to learn multi-word aspects, and a set of heuristic rules is then employed to take into account the influence of an opinion word on detecting the aspect. Second, a new metric based on mutual information and aspect frequency is proposed to score aspects with a new iterative bootstrapping algorithm, which works with an unsupervised seed set. Third, two pruning methods based on the relations between aspects in reviews are presented to remove incorrect aspects. Finally, the model employs an approach that uses explicit aspects and opinion words to identify implicit aspects: using an extracted polarity lexicon, the approach maps each opinion word in the lexicon to the set of pre-extracted explicit aspects with a co-occurrence metric. The proposed model was evaluated on a collection of English product review datasets. The model does not require any labeled training data and can be easily applied to other languages or other domains such as movie reviews. Experimental results show considerable improvements of our model over conventional techniques, including unsupervised and supervised approaches. (C) 2013 Elsevier B.V. All rights reserved.
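The final step of the abstract, mapping opinion words to pre-extracted explicit aspects through co-occurrence, can be illustrated with a minimal Python sketch. The abstract does not specify the co-occurrence metric, so the raw sentence-level co-occurrence count used here is an assumption, and the function names `build_opinion_to_aspect_map` and `infer_implicit_aspects` are hypothetical, not taken from the paper.

```python
from collections import Counter, defaultdict


def build_opinion_to_aspect_map(sentences, aspects, opinion_words):
    """Map each opinion word to the explicit aspect it co-occurs with most.

    Assumption: co-occurrence is a raw count of sentences containing both
    the opinion word and the aspect. The paper's actual metric may differ.
    """
    cooc = defaultdict(Counter)
    for tokens in sentences:
        present = set(tokens)
        for opinion in present & opinion_words:
            for aspect in present & aspects:
                cooc[opinion][aspect] += 1
    # Pick the highest count; break ties alphabetically for determinism.
    return {
        opinion: min(counts, key=lambda a: (-counts[a], a))
        for opinion, counts in cooc.items()
    }


def infer_implicit_aspects(tokens, aspects, mapping):
    """Return aspects implied by opinion words in a sentence that
    mentions no explicit aspect; return an empty set otherwise."""
    if set(tokens) & aspects:
        return set()
    return {mapping[t] for t in tokens if t in mapping}


# Toy review sentences (made up for illustration only).
reviews = [
    ["the", "price", "is", "expensive"],
    ["the", "battery", "is", "short", "lived"],
    ["expensive", "price", "for", "this", "camera"],
]
explicit_aspects = {"price", "battery"}
opinions = {"expensive", "short"}

opinion_map = build_opinion_to_aspect_map(reviews, explicit_aspects, opinions)
implied = infer_implicit_aspects(["it", "is", "too", "expensive"],
                                 explicit_aspects, opinion_map)
print(opinion_map, implied)
```

In this toy example the sentence "it is too expensive" names no aspect explicitly, so the opinion word "expensive" is used to look up the aspect it most often accompanies in the other reviews.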
Pages: 201-213
Number of pages: 13
Related papers
35 in total
  • [1] [Anonymous], 2008, P ACL 08 HLT ASS COM
  • [2] [Anonymous], 1993, COMPUT LINGUIST, DOI 10.21236/ADA273556
  • [3] [Anonymous], 2012, Mining Text Data, DOI 10.1007/978-1-4614-3223-413
  • [4] [Anonymous], 2005, Proceedings of the ACM international conference on world wide web
  • [5] [Anonymous], 2005, P C HUM LANG TECHN E, DOI DOI 10.3115/1220575.1220618
  • [6] [Anonymous], 2012, INT C WEB INF SYST E
  • [7] Detecting implicit expressions of emotion in text: A comparative analysis
    Balahur, Alexandra
    Hermida, Jesus M.
    Montoyo, Andres
    [J]. DECISION SUPPORT SYSTEMS, 2012, 53 (04) : 742 - 753
  • [8] Developing Corpora for Sentiment Analysis: The Case of Irony and Senti-TUT
    Bosco, Cristina
    Patti, Viviana
    Bolioli, Andrea
    [J]. IEEE INTELLIGENT SYSTEMS, 2013, 28 (02) : 55 - 63
  • [9] Brody S., 2010, P HUMAN LANGUAGE TEC, P804
  • [10] Chengxiang Zhai, 2001, Proceedings of the 2001 ACM CIKM. Tenth International Conference on Information and Knowledge Management, P403, DOI 10.1145/502585.502654