Reinforcement Learning for Dialog Management using Least-Squares Policy Iteration and Fast Feature Selection


Authors:

Li, Lihong [1]

Williams, Jason D. [2]

Balakrishnan, Suhrid [2]

Affiliations:

[1] Rutgers State Univ, Dept Comp Sci, Piscataway, NJ 08854 USA

[2] AT&T Labs Res, 180 Pk Ave, Florham Pk, NJ 07932 USA

Source:

INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5 | 2009

Keywords:

Dialog management; spoken dialog systems; partially observable Markov decision processes

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Reinforcement learning (RL) is a promising technique for creating a dialog manager. RL accepts features of the current dialog state and seeks to find the best action given those features. Although it is often easy to posit a large set of potentially useful features, in practice it is difficult to find a subset that is large enough to contain useful information yet compact enough to reliably learn a good policy. In this paper, we propose a method for RL optimization which automatically performs feature selection. The algorithm is based on least-squares policy iteration, a state-of-the-art RL algorithm which is highly sample-efficient and can learn from a static corpus or on-line. Experiments in dialog simulation show it is more stable than a baseline RL algorithm taken from a working dialog system.
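
To make the phrase "least-squares policy iteration" concrete, the following is a minimal sketch of generic LSPI learning from a static batch of samples. It is not the feature-selection method proposed in the paper, whose details this record does not give. The four-state chain environment, the one-hot features, and every name in the code (`step`, `phi`, `lstdq`, `lspi`) are hypothetical choices made only for illustration.

```python
# Illustrative sketch of generic least-squares policy iteration (LSPI).
# Assumptions (not from the paper): a toy 4-state chain, 2 actions,
# one-hot features over (state, action) pairs, and a fixed sample batch.

N_STATES, N_ACTIONS = 4, 2  # actions: 0 = left, 1 = right


def step(s, a):
    """Hypothetical deterministic chain: reward 1 for landing in the last state."""
    s2 = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
    return (1.0 if s2 == N_STATES - 1 else 0.0), s2


def phi(s, a):
    """One-hot feature vector for a (state, action) pair."""
    f = [0.0] * (N_STATES * N_ACTIONS)
    f[s * N_ACTIONS + a] = 1.0
    return f


def q(w, s, a):
    """Linear Q-value estimate: dot product of weights and features."""
    return sum(wi * fi for wi, fi in zip(w, phi(s, a)))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def lstdq(samples, policy, gamma):
    """Policy evaluation: accumulate A and b from samples, then solve A w = b."""
    k = N_STATES * N_ACTIONS
    A = [[0.0] * k for _ in range(k)]
    b = [0.0] * k
    for s, a, r, s2 in samples:
        f, f2 = phi(s, a), phi(s2, policy[s2])
        for i in range(k):
            b[i] += f[i] * r
            for j in range(k):
                A[i][j] += f[i] * (f[j] - gamma * f2[j])
    return solve(A, b)


def lspi(samples, gamma=0.9, max_iters=20):
    """Alternate evaluation and greedy improvement until the policy is stable."""
    policy = [0] * N_STATES  # start from "always left"
    w = lstdq(samples, policy, gamma)
    for _ in range(max_iters):
        greedy = [max(range(N_ACTIONS), key=lambda a: q(w, s, a))
                  for s in range(N_STATES)]
        if greedy == policy:
            break
        policy = greedy
        w = lstdq(samples, policy, gamma)
    return policy, w


# Static batch: one sample (s, a, r, s') for every state-action pair.
samples = [(s, a) + step(s, a) for s in range(N_STATES) for a in range(N_ACTIONS)]
policy, w = lspi(samples)
```

Because the batch is collected once and reused in every iteration, the sketch reflects the abstract's point that this family of algorithms can learn from a static corpus. With one-hot features the evaluation step is exact; a realistic dialog manager would use a much smaller, hand-chosen or automatically selected feature set, which is the problem the paper addresses.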


Pages: 2447 / +

Page count: 2
