Using reinforcement learning with external rewards for open-domain natural language generation

Cited by: 0
Authors
Vidhushini Srinivasan
Sashank Santhanam
Samira Shaikh
Affiliation
[1] University of North Carolina at Charlotte, Department of Computer Science
Source
Journal of Intelligent Information Systems | 2021 / Vol. 56
Keywords
Deep learning; Reinforcement learning; Emotional intelligence; Human feedback; Seq2seq learning; Conversational agent; Natural language generation;
DOI: not available
Abstract
We propose a new approach toward emotional natural language generation using a bidirectional seq2seq model. Our goal is to generate emotionally relevant language that accommodates the emotional tone of the prior context. To incorporate emotional information, we train our own embeddings, appended with emotion values in the form of valence, arousal, and dominance scores. We use a reinforcement-learning framework tuned with the policy-gradient method. Two of the internal rewards in our framework, Ease of Answering and Semantic Coherence, are based on the prior state of the art. We propose a new internal reward, Emotional Intelligence, computed by minimizing the affective dissonance between the source and the generated text. We also train a separate external reward analyzer to predict the rewards and to maximize the expected rewards (both internal and external). We evaluate the system on two corpora commonly used for natural language generation tasks: the Cornell Movie Dialog and the Yelp Restaurant Review Corpus. We report standard evaluation metrics, including BLEU, ROUGE-L, and perplexity, as well as human evaluation, to validate our approach. We demonstrate the ability of the proposed model to generate emotionally appropriate responses on both corpora.
Pages: 189-206
Page count: 17