Learning what to say and how to say it: Joint optimisation of spoken dialogue management and natural language generation

Cited by: 35
Authors
Lemon, Oliver [1]
Affiliations
[1] Heriot Watt Univ, Edinburgh, Midlothian, Scotland
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK;
Keywords
Dialogue systems; Natural Language Generation; Reinforcement Learning; SYSTEMS;
DOI
10.1016/j.csl.2010.04.005
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
This paper argues that the problems of dialogue management (DM) and Natural Language Generation (NLG) in dialogue systems are closely related and can be fruitfully treated statistically, in a joint optimisation framework such as that provided by Reinforcement Learning (RL). We first review recent results and methods in automatic learning of dialogue management strategies for spoken and multimodal dialogue systems, and then show how these techniques can also be used for the related problem of Natural Language Generation. This approach promises a number of theoretical and practical benefits such as fine-grained adaptation, generalisation, and automatic (global) optimisation, and we compare it to related work in statistical/trainable NLG. A demonstration of the proposed approach is then developed, showing combined DM and NLG policy learning for adaptive information presentation decisions. A joint DM and NLG policy learned in the framework shows a statistically significant 27% relative increase in reward over a baseline policy, which is learned in the same way only without the joint optimisation. We thereby show that NLG problems can be approached statistically, in combination with dialogue management decisions, and we show how to jointly optimise NLG and DM using Reinforcement Learning. © 2010 Elsevier Ltd. All rights reserved.
Pages: 210-221
Page count: 12