POLICY EVALUATION WITH STOCHASTIC GRADIENT ESTIMATION TECHNIQUES

Citations: 0

Authors:

Zhou, Yi [1]

Fu, Michael C. [2]

Ryzhov, Ilya O.

Affiliations:

[1] Univ Maryland, Inst Syst Res, Dept Math, 8223 Paint Branch Dr, College Pk, MD 20742 USA

[2] Univ Maryland, Inst Syst Res, Robert H Smith Sch Business, 7699 Mowatt Ln, College Pk, MD 20742 USA

Source:

2022 WINTER SIMULATION CONFERENCE (WSC) | 2022

Keywords:

OPTIMIZATION;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

In this paper, we consider policy evaluation in a finite-horizon setting with continuous state variables. The Bellman equation represents the value function as a conditional expectation, which can be further transformed into a ratio of two stochastic gradients. By using the finite difference method and the generalized likelihood ratio method, we propose new estimators for policy evaluation and show how the value of any given state can be estimated using sample paths starting from various other states.
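
One common way to read "a ratio of two stochastic gradients" is the identity E[Y | X = x] = (d/dx E[Y 1{X <= x}]) / (d/dx E[1{X <= x}]), which holds when X has a density that is positive at x. The sketch below applies a central finite difference to both derivatives, so samples whose state lands near x contribute to the estimate even though no path starts exactly at x. It is only an illustration under assumptions: the function name `fd_ratio_estimate`, the bandwidth `h`, and the toy model (standard normal states, return equal to the squared state plus noise) are hypothetical and not taken from the paper, whose estimators may differ in detail.

```python
import numpy as np


def fd_ratio_estimate(x, states, returns, h):
    """Central finite-difference estimate of E[returns | state = x].

    Numerator:   [G(x + h) - G(x - h)] / (2h), with G(x) = E[Y * 1{X <= x}]
    Denominator: [F(x + h) - F(x - h)] / (2h), with F(x) = P(X <= x)
    The common factor 1 / (2 * h * n) cancels in the ratio.
    """
    states = np.asarray(states, dtype=float)
    returns = np.asarray(returns, dtype=float)
    # 1{X <= x + h} - 1{X <= x - h} selects samples with x - h < X <= x + h.
    in_window = (states <= x + h) & ~(states <= x - h)
    count = in_window.sum()
    if count == 0:
        raise ValueError("no samples within h of x; increase h or the sample size")
    return returns[in_window].sum() / count


# Toy check (hypothetical model): X ~ N(0, 1), Y = X^2 + noise, so E[Y | X = x] = x^2.
rng = np.random.default_rng(0)
n = 200_000
states = rng.normal(size=n)
returns = states**2 + rng.normal(size=n)

estimate = fd_ratio_estimate(1.0, states, returns, h=0.05)
print(f"estimate at x = 1.0: {estimate:.3f} (true value 1.0)")
```

The bandwidth `h` trades bias against variance: a smaller `h` reduces the finite-difference bias but leaves fewer samples in the window. The generalized likelihood ratio method mentioned in the abstract avoids this perturbation parameter, and it is not sketched here.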

Pages: 3039 - 3050

Page count: 12
