Phonetic-enriched text representation for Chinese sentiment analysis with reinforcement learning

Cited by: 38
Authors
Peng, Haiyun [1]
Ma, Yukun [1]
Poria, Soujanya [2]
Li, Yang [3]
Cambria, Erik [4]
Affiliations
[1] Alibaba Group, Singapore, Singapore
[2] Singapore University of Technology and Design, Singapore, Singapore
[3] Northwestern Polytechnical University, Xi'an, People's Republic of China
[4] Nanyang Technological University, Singapore, Singapore
Keywords
Sentiment analysis; Multilingual sentiment analysis; Chinese phonetics; Deep phonemic orthography; NETWORK
DOI
10.1016/j.inffus.2021.01.005
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Chinese pronunciation system offers two characteristics that distinguish it from other languages: deep phonemic orthography and intonation variations. In this paper, we hypothesize that these two important properties can play a major role in Chinese sentiment analysis. In particular, we propose two effective features to encode phonetic information and, hence, fuse it with textual information. With this hypothesis, we propose Disambiguate Intonation for Sentiment Analysis (DISA), a network that we develop based on the principles of reinforcement learning. DISA disambiguates intonations for each Chinese character (pinyin) and, hence, learns precise phonetic representations. We also fuse phonetic features with textual and visual features to further improve performance. Experimental results on five different Chinese sentiment analysis datasets show that the inclusion of phonetic features significantly and consistently improves the performance of textual and visual representations and surpasses the state-of-the-art Chinese character-level representations.
Pages: 88-99
Page count: 12