Adaptive policies for time-varying stochastic systems under discounted criterion

Cited by: 9
Authors
Hilgert, N
Minjárez-Sosa, JA
Affiliations
[1] ENSAM, INRA, Lab Biometrie, F-34060 Montpellier 1, France
[2] Univ Sonora, Dept Matemat, Hermosillo 83000, Sonora, Mexico
Keywords
non-homogeneous Markov control processes; discrete-time stochastic systems; discounted cost criterion; optimal adaptive policy
DOI
10.1007/s001860100170
Chinese Library Classification (CLC) number
C93 [Management Science]; O22 [Operations Research]
Discipline classification code
070105; 12; 1201; 1202; 120202
Abstract
We consider a class of time-varying stochastic control systems with Borel state and action spaces and possibly unbounded costs. The processes evolve according to the discrete-time equation $x_{n+1} = G_n(x_n, a_n, \xi_n)$, $n = 0, 1, \ldots$, where the $\xi_n$ are i.i.d. $\mathbb{R}^k$-valued random vectors whose common density is unknown, and the $G_n$ are given functions converging, in a restricted way, to some function $G_\infty$ as $n \to \infty$. Assuming observability of $\xi_n$, we construct an adaptive policy which is asymptotically discounted cost optimal for the limiting control system $x_{n+1} = G_\infty(x_n, a_n, \xi_n)$.
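To make the setting in the abstract concrete, the following is a minimal Python sketch of a scalar system of the stated form, in which the disturbances are observed and their unknown density is estimated from the observations. Everything specific here is an assumption for illustration: the dynamics `G` and `G_inf`, the placeholder feedback rule, the Gaussian noise, and the Gaussian kernel density estimator are hypothetical choices, not the construction or the estimator used in the paper.

```python
import math
import random


def G(n, x, a, xi):
    """Hypothetical time-varying dynamics G_n; the 1/(n+1) term vanishes as n grows."""
    return 0.5 * x + a + xi + 1.0 / (n + 1)


def G_inf(x, a, xi):
    """Hypothetical limiting dynamics G_infinity that G_n approaches."""
    return 0.5 * x + a + xi


def kde(samples, bandwidth):
    """Return a Gaussian kernel density estimate built from observed disturbances."""
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(s):
        return sum(math.exp(-0.5 * ((s - xi) / bandwidth) ** 2) for xi in samples) / norm

    return density


def simulate(steps, seed=0):
    """Run the system and record the observed disturbances xi_0, ..., xi_{steps-1}."""
    rng = random.Random(seed)
    x, observed = 0.0, []
    for n in range(steps):
        a = -0.5 * x              # placeholder feedback rule, NOT the paper's adaptive policy
        xi = rng.gauss(0.0, 1.0)  # drawn from a density the controller does not know
        observed.append(xi)       # observability of xi_n: the controller sees each draw
        x = G(n, x, a, xi)
    return x, observed


x_final, observed = simulate(2000)
rho_hat = kde(observed, bandwidth=0.3)  # estimate of the unknown common density
gap = abs(G(1999, 1.0, 0.0, 0.0) - G_inf(1.0, 0.0, 0.0))  # distance between G_n and G_inf at n = 1999
```

In a sketch like this, an adaptive controller would replace the placeholder rule with one that uses `rho_hat` in place of the true density when evaluating the discounted cost for the limiting dynamics `G_inf`; how that is done, and under which conditions it is asymptotically optimal, is the subject of the paper itself.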
Pages: 491-505
Number of pages: 15