POLICY EVALUATION WITH STOCHASTIC GRADIENT ESTIMATION TECHNIQUES

Times cited: 0
Authors
Zhou, Yi [1]
Fu, Michael C. [2]
Ryzhov, Ilya O.
Affiliations
[1] Univ Maryland, Inst Syst Res, Dept Math, 8223 Paint Branch Dr, College Pk, MD 20742 USA
[2] Univ Maryland, Inst Syst Res, Robert H Smith Sch Business, 7699 Mowatt Ln, College Pk, MD 20742 USA
Source
2022 WINTER SIMULATION CONFERENCE (WSC) | 2022
Keywords
OPTIMIZATION;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
In this paper, we consider policy evaluation in a finite-horizon setting with continuous state variables. The Bellman equation represents the value function as a conditional expectation, which can be further transformed into a ratio of two stochastic gradients. By using the finite difference method and the generalized likelihood ratio method, we propose new estimators for policy evaluation and show how the value of any given state can be estimated using sample paths starting from various other states.
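The ratio form mentioned in the abstract can be illustrated with a standard identity: under regularity conditions, E[Y | X = x] equals d/dx E[Y 1{X <= x}] divided by d/dx E[1{X <= x}]. The sketch below is a toy illustration only, not the authors' estimator. It applies a central finite difference to both derivatives, and the model (X standard normal, Y = X^2 plus noise), the function name `conditional_mean_fd`, and the bandwidth `h` are all assumptions made for the example. The generalized likelihood ratio estimator is not reproduced here.

```python
import random


def conditional_mean_fd(xs, ys, x0, h):
    """Estimate E[Y | X = x0] as a ratio of two central finite differences.

    Numerator:   (E[Y 1{X <= x0+h}] - E[Y 1{X <= x0-h}]) / (2h)
    Denominator: (E[1{X <= x0+h}]   - E[1{X <= x0-h}])   / (2h)
    The difference of indicators is 1{x0-h < X <= x0+h}, so samples drawn
    at values near x0 (not exactly at x0) contribute to the estimate.
    """
    n = len(xs)
    num_sum = 0.0
    den_sum = 0.0
    for x, y in zip(xs, ys):
        if x0 - h < x <= x0 + h:
            num_sum += y
            den_sum += 1.0
    if den_sum == 0.0:
        raise ValueError("no samples fall within h of x0; increase h or n")
    num_grad = num_sum / (2.0 * h * n)  # finite-difference estimate of the numerator derivative
    den_grad = den_sum / (2.0 * h * n)  # finite-difference estimate of the density of X at x0
    return num_grad / den_grad


# Hypothetical toy model: X ~ N(0, 1), Y = X^2 + noise, so E[Y | X = x] = x^2.
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(200_000)]
ys = [x * x + rng.gauss(0.0, 0.5) for x in xs]

estimate = conditional_mean_fd(xs, ys, x0=1.0, h=0.1)
print(f"estimated E[Y | X = 1.0]: {estimate:.3f} (exact value for this toy model: 1.0)")
```

A smaller `h` reduces the finite-difference bias but leaves fewer samples in the window, which raises the variance; this trade-off is typical of finite-difference gradient estimators.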
Pages: 3039-3050
Number of pages: 12