Automatic construction of domain-specific sentiment lexicon for unsupervised domain adaptation and sentiment classification

Cited by: 25

Authors:

Beigi, Omid Mohamad [1]

Moattar, Mohammad H. [1]

Affiliations:

[1] Islamic Azad Univ, Comp Engn Dept, Mashhad Branch, Mashhad, Razavi Khorasan, Iran

Source:

KNOWLEDGE-BASED SYSTEMS | 2021 / Vol. 213

Keywords:

Sentiment analysis; Domain adaptation; Domain independent lexicon; Multilayer perceptron

DOI:

10.1016/j.knosys.2020.106423

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Sentiment analysis has long suffered from inaccuracies, whether it uses machine learning methods that rely mostly on text features or sentiment lexicon-based methods that are prone to domain-dependent problems. Furthermore, since labeling is a time-consuming and expensive task, supervised machine learning methods suffer from insufficient labeled samples. To tackle these issues, this paper proposes a novel approach that combines a neural network with a sentiment lexicon. This combination can simultaneously adapt word polarities to the target domain and leverage the polarity of the whole document, alleviating the need for large labeled corpora in an unsupervised manner. First, a sentiment lexicon is constructed from the labeled source-domain data in the preprocessing phase. In the next phase, the weights of the first hidden layer of a Multilayer Perceptron (MLP) are set to the corresponding polarity of each word from the retrieved sentiment lexicon, and the network is trained. Finally, a Domain-Independent Lexicon (DIL) is introduced, which contains words (mostly adjectives) with static positive or negative scores that do not depend on a specific domain. After the target domain is fed to the pre-trained model, the overall accuracy of the framework is enhanced by estimating the sentiment polarity of each sentence as the sum of the scores of its constituent domain-independent words. Experiments on the Amazon multi-domain sentiment dataset show that the approach significantly outperforms several previous approaches to unsupervised domain adaptation. © 2020 Published by Elsevier B.V.
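The three steps named in the abstract (building a lexicon from labeled source data, seeding the first MLP layer with word polarities, and summing domain-independent word scores per sentence) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the count-based polarity formula, the function names `build_lexicon`, `init_first_layer`, and `dil_score`, the toy documents, and the tiny DIL dictionary. Network training is omitted.

```python
from collections import Counter

def build_lexicon(docs, labels):
    """Score each word in [-1, 1] from labeled source-domain documents.

    Assumed formula (not from the paper): (pos - neg) / (pos + neg),
    counting each word once per document.
    """
    pos, neg = Counter(), Counter()
    for text, label in zip(docs, labels):
        target = pos if label == 1 else neg
        target.update(set(text.lower().split()))
    lexicon = {}
    for word in set(pos) | set(neg):
        p, n = pos[word], neg[word]
        lexicon[word] = (p - n) / (p + n)
    return lexicon

def init_first_layer(vocab, lexicon, n_hidden):
    """One row per vocabulary word; every hidden unit starts at that word's polarity."""
    return [[lexicon.get(word, 0.0)] * n_hidden for word in vocab]

def dil_score(sentence, dil):
    """Sum the static scores of the domain-independent words in a sentence."""
    return sum(dil.get(tok, 0.0) for tok in sentence.lower().split())

# Toy source-domain data (hypothetical): 1 = positive review, 0 = negative review.
source_docs = ["great sound and great battery", "poor battery and bad sound", "great screen"]
source_labels = [1, 0, 1]

lexicon = build_lexicon(source_docs, source_labels)
vocab = sorted(lexicon)
weights = init_first_layer(vocab, lexicon, n_hidden=4)

# Hypothetical domain-independent lexicon with static scores.
dil = {"great": 1.0, "good": 1.0, "bad": -1.0, "poor": -1.0}
score = dil_score("good plot but bad and poor ending", dil)
```

In this sketch a word such as "battery", which appears in both a positive and a negative source review, gets polarity 0, while the sentence-level DIL score depends only on words whose sentiment is assumed to be stable across domains.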

Pages: 12
