On the linear convergence of policy gradient under Hadamard parameterization

被引：0

作者：

Liu, Jiacai ^{[1
]}

Chen, Jinchi ^{[2
]}

Wei, Ke ^{[1
]}

机构：

[1] Fudan Univ, Sch Data Sci, Shanghai 200433, Peoples R China

[2] East China Univ Sci & Technol, Sch Math, Shanghai 200433, Peoples R China

来源：

INFORMATION AND INFERENCE-A JOURNAL OF THE IMA | 2025年 / 14卷 / 01期

关键词：

policy gradient; Hadamard parameterization; linear convergence; sub-optimal probability;

D O I：

10.1093/imaiai/iaaf003

中图分类号：

O29 [应用数学];

学科分类号：

070104 ;

摘要：

The convergence of deterministic policy gradient under the Hadamard parameterization is studied in the tabular setting and the linear convergence of the algorithm is established. To this end, we first show that the error decreases at an $O(\frac{1}{k})$ rate for all the iterations. Based on this result, we further show that the algorithm has a faster local linear convergence rate after $k_{0}$ iterations, where $k_{0}$ is a constant that only depends on the MDP problem and the initialization. To show the local linear convergence of the algorithm, we have indeed established the contraction of the sub-optimal probability $b_{s}<^>{k}$ (i.e. the probability of the output policy $\pi <^>{k}$ on non-optimal actions) when $k\ge k_{0}$.

引用

页数：38

共 50 条

[41] Policy Gradient MaxSAT Solver
Gutierrez-De-La-Paz, Omar
Menchaca-Mendez, Ricardo
Zamora-Gomez, Erik
Corona-Bermudez, Uriel
Menchaca-Mendez, Rolando
Gutierrez-De-La-Paz, Bruno
COMPUTACION Y SISTEMAS, 2024, 28 (02): : 353 - 366
[42] On Faster Convergence of Scaled Sign Gradient Descent
Li, Xiuxian
Lin, Kuo-Yi
Li, Li
Hong, Yiguang
Chen, Jie
IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2024, 20 (02) : 1732 - 1741
[43] On the linear convergence of circumcentered isometry methods
Bauschke, Heinz H.
Ouyang, Hui
Wang, Xianfu
NUMERICAL ALGORITHMS, 2021, 87 (01) : 263 - 297
[44] Transforming logarithmic to linear convergence by interpolation
Homeier, HHH
APPLIED MATHEMATICS LETTERS, 1999, 12 (02) : 13 - 17
[45] Restart FISTA with Global Linear Convergence
Alamo, Teodoro
Krupa, Pablo
Limon, Daniel
2019 18TH EUROPEAN CONTROL CONFERENCE (ECC), 2019, : 1969 - 1974
[46] On the linear convergence of circumcentered isometry methods
Heinz H. Bauschke
Hui Ouyang
Xianfu Wang
Numerical Algorithms, 2021, 87 : 263 - 297
[47] On the Linear Convergence of Two Decentralized Algorithms
Yao Li
Ming Yan
Journal of Optimization Theory and Applications, 2021, 189 : 271 - 290
[48] On the Linear Convergence of Two Decentralized Algorithms
Li, Yao
Yan, Ming
JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 2021, 189 (01) : 271 - 290
[49] ON THE LINEAR CONVERGENCE OF THE MULTIMARGINAL SINKHORN ALGORITHM
Carlier, Guillaume
SIAM JOURNAL ON OPTIMIZATION, 2022, 32 (02) : 786 - 794
[50] ON THE LINEAR CONVERGENCE OF A BREGMAN PROXIMAL POINT
Guo, Ke
Zhu, Chunrong
JOURNAL OF NONLINEAR AND VARIATIONAL ANALYSIS, 2022, 6 (02): : 5 - 14

← 1 2 3 4 5 →