Cross-entropic learning of a machine for the decision in a partially observable universe

Cited by: 12

Authors:

Dambreville, Frederic [1]

Affiliations:

[1] GIP, ASC, CEP, DET,DGA, F-94114 Arcueil, France

Source:

JOURNAL OF GLOBAL OPTIMIZATION | 2007 / Vol. 37 / No. 04

Keywords:

control; Markov decision process; partially observable Markov decision process; hierarchical hidden Markov models; Bayesian networks; cross-entropy

DOI:

10.1007/s10898-006-9061-9

Chinese Library Classification (CLC):

C93 [Management Science]; O22 [Operations Research]

Discipline classification codes:

070105 ; 12 ; 1201 ; 1202 ; 120202 ;

Abstract:

In this paper, we are interested in optimal decisions in a partially observable universe. Our approach is to directly approximate an optimal strategic tree depending on the observation. This approximation is made by means of a parameterized probabilistic law. A particular family of Hidden Markov Models (HMMs), with input and output, is considered as a model of policy. A method for optimizing the parameters of these HMMs is proposed and applied. This optimization is based on the cross-entropy (CE) principle for rare-event simulation developed by Rubinstein.
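As a rough illustration of the cross-entropy idea mentioned in the abstract, the following is a minimal sketch and not the paper's algorithm. It assumes a toy setting in which the parameterized probabilistic law is a memoryless table `probs[o][a]` (probability of action `a` given observation `o`) rather than the HMM family the paper considers. The names `cross_entropy_search`, `score`, `TARGET`, and all numeric settings are hypothetical choices made for this example only.

```python
import random

def cross_entropy_search(score, n_obs, n_actions, n_samples=200,
                         elite_frac=0.1, smoothing=0.7, n_iters=30, seed=0):
    """Toy cross-entropy search over observation -> action tables.

    Assumption: the policy is a memoryless table, not an HMM.
    """
    rng = random.Random(seed)
    # Start from the uniform law over actions for every observation.
    probs = [[1.0 / n_actions] * n_actions for _ in range(n_obs)]
    n_elite = max(1, int(n_samples * elite_frac))
    actions = range(n_actions)
    for _ in range(n_iters):
        # 1. Sample candidate policies from the current law.
        samples = [
            [rng.choices(actions, weights=probs[o])[0] for o in range(n_obs)]
            for _ in range(n_samples)
        ]
        # 2. Keep the best-scoring fraction (the "elite" samples).
        samples.sort(key=score, reverse=True)
        elite = samples[:n_elite]
        # 3. Move the law toward the empirical frequencies of the elite,
        #    with smoothing so that no probability collapses to zero at once.
        for o in range(n_obs):
            for a in actions:
                freq = sum(1 for policy in elite if policy[o] == a) / n_elite
                probs[o][a] = smoothing * freq + (1 - smoothing) * probs[o][a]
    return probs

# Hypothetical evaluation: count observations where the chosen action
# matches a hidden best action. A real problem would simulate the system.
TARGET = [0, 1, 1, 0, 1]

def score(policy):
    return sum(1 for chosen, best in zip(policy, TARGET) if chosen == best)

probs = cross_entropy_search(score, n_obs=len(TARGET), n_actions=2)
# Most probable action per observation under the learned law.
greedy = [max(range(2), key=row.__getitem__) for row in probs]
```

Each iteration samples from the current law, selects the best samples, and refits the law to them; that loop is the part this sketch shares with the cross-entropy method, while the policy model and evaluation are placeholders.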

Citation

Pages: 541-555

Page count: 15

12 references in total

[1] [Anonymous], 1971, THESIS I OPERATIONS

[2] [Anonymous], 2004, PROC C INTELLIGENT A

[3] Bellman R., 1957, DYNAMIC PROGRAMMING

[4] CASSANDRA AR, 1998, THESIS BROWN U RHODE

[5] DEBOER PT, TUTORIAL CROSS ENTRO

[6] Fine, S; Singer, Y; Tishby, N. The hierarchical hidden Markov model: Analysis and applications [J]. MACHINE LEARNING, 1998, 32 (01): 41-62

[7] HOMEMDEMELLO T, RARE EVENT ESTIMATIO

[8] Meuleau N, 1999, UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, P427

[9] MURPHY K, 2001, P NEUR INF PROC SYST

[10] Rubinstein R. Y., 2004, CROSS ENTROPY METHOD
