Reinforcement Learning for Dialog Management using Least-Squares Policy Iteration and Fast Feature Selection

Cited by: 0
Authors
Li, Lihong [1]
Williams, Jason D. [2]
Balakrishnan, Suhrid [2]
Affiliations
[1] Rutgers State Univ, Dept Comp Sci, Piscataway, NJ 08854 USA
[2] AT&T Labs Res, 180 Pk Ave, Florham Pk, NJ 07932 USA
Source
INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5 | 2009
Keywords
Dialog management; spoken dialog systems; partially observable Markov decision processes;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Reinforcement learning (RL) is a promising technique for creating a dialog manager. RL accepts features of the current dialog state and seeks to find the best action given those features. Although it is often easy to posit a large set of potentially useful features, in practice, it is difficult to find the subset which is large enough to contain useful information yet compact enough to reliably learn a good policy. In this paper, we propose a method for RL optimization which automatically performs feature selection. The algorithm is based on least-squares policy iteration, a state-of-the-art RL algorithm which is highly sample-efficient and can learn from a static corpus or on-line. Experiments in dialog simulation show it is more stable than a baseline RL algorithm taken from a working dialog system.
Pages: 2447 / +
Page count: 2