Weakly supervised topic sentiment joint model with word embeddings

Cited by: 31
Authors
Fu, Xianghua [1]
Sun, Xudong [1]
Wu, Haiying [1]
Cui, Laizhong [1]
Huang, Joshua Zhexue [1]
Affiliations
[1] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen, Peoples R China
Keywords
Sentiment analysis; Topic model; Topic sentiment joint model; Word embeddings
DOI
10.1016/j.knosys.2018.02.012
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A topic sentiment joint model aims to handle the mixture of topics and sentiments in online reviews simultaneously. Most existing topic sentiment modeling algorithms are based on the state-of-the-art latent Dirichlet allocation (LDA) and probabilistic latent semantic analysis (PLSA), which infer sentiment and topic distributions from the co-occurrence of words. These methods have been used successfully for topic and sentiment analysis. However, when the training corpus is small or the documents are short, the textual features become sparse, and the inferred sentiment and topic distributions may be unsatisfactory. In this paper, we propose a novel topic sentiment joint model called the weakly supervised topic sentiment joint model with word embeddings (WS-TSWE), which incorporates word embeddings and the HowNet lexicon simultaneously to improve topic identification and sentiment recognition. The main contributions of WS-TSWE are twofold. (1) Existing models generate words only from the sentiment-topic-to-word Dirichlet multinomial component, whereas WS-TSWE replaces it with a mixture of two components: a Dirichlet multinomial component and a word embeddings component. Because the word embeddings are trained on very large corpora and can extend the semantic information of words, they offer a partial remedy for the problem of textual sparsity. (2) Most previous models incorporate sentiment knowledge through the beta priors, which are usually set from a dictionary and rely entirely on prior domain knowledge to identify positive and negative words. In contrast, WS-TSWE calculates the sentiment orientation of each word with the HowNet lexicon and automatically infers sentiment-based beta priors for sentiment analysis and opinion mining. Furthermore, we implement WS-TSWE with Gibbs sampling algorithms.
The experimental results on Chinese and English data sets show that WS-TSWE achieves strong performance on the task of detecting sentiment and topics simultaneously. (c) 2018 Elsevier B.V. All rights reserved.
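
The two-component word generation described in contribution (1) can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything below is an assumption made for illustration: the mixing weight `lam`, the softmax over dot products between a sentiment-topic vector `topic_vec` and the word vectors `word_vecs`, and the function names are all hypothetical, and `beta` merely stands in for the sentiment-based priors that the paper derives from HowNet.

```python
import numpy as np


def softmax(scores):
    """Numerically stable softmax over a 1-D array of scores."""
    shifted = scores - np.max(scores)
    exp = np.exp(shifted)
    return exp / exp.sum()


def mixture_word_distribution(counts, beta, topic_vec, word_vecs, lam):
    """Word distribution for one (sentiment, topic) pair as a two-component mixture.

    counts    : word counts currently assigned to this sentiment-topic pair, shape (V,)
    beta      : per-word prior (a stand-in for sentiment-based priors), shape (V,)
    topic_vec : hypothetical embedding of the sentiment-topic pair, shape (D,)
    word_vecs : pre-trained word embeddings, shape (V, D)
    lam       : hypothetical mixing weight in [0, 1] for the embedding component
    """
    # Dirichlet multinomial component: prior-smoothed relative frequencies.
    dm = (counts + beta) / (counts.sum() + beta.sum())
    # Word embeddings component: softmax of word/topic dot products (assumed form).
    emb = softmax(word_vecs @ topic_vec)
    # Convex combination of the two components is again a distribution.
    return (1.0 - lam) * dm + lam * emb
```

In this sketch, a word with a zero count in a sparse corpus can still receive noticeable probability when its embedding is close to the sentiment-topic vector, which is the intuition behind using embeddings against textual sparsity.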
Pages: 43-54
Number of pages: 12