Adaptive policies for time-varying stochastic systems under discounted criterion

Cited by: 9
Authors
Hilgert, N
Minjárez-Sosa, JA
Affiliations
[1] ENSAM, INRA, Lab Biometrie, F-34060 Montpellier 1, France
[2] Univ Sonora, Dept Matemat, Hermosillo 83000, Sonora, Mexico
Keywords
non-homogeneous Markov control processes; discrete-time stochastic systems; discounted cost criterion; optimal adaptive policy
DOI
10.1007/s001860100170
Chinese Library Classification (CLC)
C93 [Management Science]; O22 [Operations Research]
Discipline classification codes
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
Abstract
We consider a class of time-varying stochastic control systems, with Borel state and action spaces, and possibly unbounded costs. The processes evolve according to a discrete-time equation $x_{n+1} = G_n(x_n, a_n, \xi_n)$, $n = 0, 1, \ldots$, where the $\xi_n$ are i.i.d. $\mathbb{R}^k$-valued random vectors whose common density is unknown, and the $G_n$ are given functions converging, in a restricted way, to some function $G_\infty$ as $n \to \infty$. Assuming observability of $\xi_n$, we construct an adaptive policy which is asymptotically discounted cost optimal for the limiting control system $x_{n+1} = G_\infty(x_n, a_n, \xi_n)$.
Pages: 491-505
Number of pages: 15