Cross-entropic learning of a machine for the decision in a partially observable universe

Cited by: 12
Authors
Dambreville, Frederic [1]
Affiliation
[1] GIP, ASC, CEP, DET, DGA, F-94114 Arcueil, France
Keywords
control; Markov decision process/partially observable Markov decision process; hierarchical hidden Markov models; Bayesian networks; cross-entropy
DOI
10.1007/s10898-006-9061-9
Chinese Library Classification
C93 [Management Science]; O22 [Operations Research]
Discipline classification codes
070105; 12; 1201; 1202; 120202
Abstract
In this paper, we are interested in optimal decisions in a partially observable universe. Our approach is to directly approximate an optimal strategic tree depending on the observation. This approximation is made by means of a parameterized probabilistic law. A particular family of Hidden Markov Models (HMM), with input and output, is considered as a model of policy. A method for optimizing the parameters of these HMMs is proposed and applied. This optimization is based on the cross-entropic (CE) principle for rare events simulation developed by Rubinstein.
Pages: 541-555
Page count: 15
Related papers
12 in total
[1] [Anonymous], 1971, THESIS I OPERATIONS
[2] [Anonymous], 2004, PROC C INTELLIGENT A
[3] Bellman R., 1957, DYNAMIC PROGRAMMING
[4] CASSANDRA AR, 1998, THESIS BROWN U RHODE
[5] DEBOER PT, TUTORIAL CROSS ENTRO
[6] Fine, S; Singer, Y; Tishby, N. The hierarchical hidden Markov model: Analysis and applications [J]. MACHINE LEARNING, 1998, 32 (01): 41-62
[7] HOMEMDEMELLO T, RARE EVENT ESTIMATIO
[8] Meuleau N, 1999, UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, P427
[9] MURPHY K, 2001, P NEUR INF PROC SYST
[10] Rubinstein R. Y., 2004, CROSS ENTROPY METHOD