On the linear convergence of policy gradient under Hadamard parameterization

被引：0

作者：

Liu, Jiacai ^{[1
]}

Chen, Jinchi ^{[2
]}

Wei, Ke ^{[1
]}

机构：

[1] Fudan Univ, Sch Data Sci, Shanghai 200433, Peoples R China

[2] East China Univ Sci & Technol, Sch Math, Shanghai 200433, Peoples R China

来源：

INFORMATION AND INFERENCE-A JOURNAL OF THE IMA | 2025年 / 14卷 / 01期

关键词：

policy gradient; Hadamard parameterization; linear convergence; sub-optimal probability;

D O I：

10.1093/imaiai/iaaf003

中图分类号：

O29 [应用数学];

学科分类号：

070104 ;

摘要：

The convergence of deterministic policy gradient under the Hadamard parameterization is studied in the tabular setting and the linear convergence of the algorithm is established. To this end, we first show that the error decreases at an $O(\frac{1}{k})$ rate for all the iterations. Based on this result, we further show that the algorithm has a faster local linear convergence rate after $k_{0}$ iterations, where $k_{0}$ is a constant that only depends on the MDP problem and the initialization. To show the local linear convergence of the algorithm, we have indeed established the contraction of the sub-optimal probability $b_{s}<^>{k}$ (i.e. the probability of the output policy $\pi <^>{k}$ on non-optimal actions) when $k\ge k_{0}$.

引用

页数：38

共 50 条

[31] Level-Set Subdifferential Error Bounds and Linear Convergence of Bregman Proximal Gradient Method
Daoli Zhu
Sien Deng
Minghua Li
Lei Zhao
Journal of Optimization Theory and Applications, 2021, 189 : 889 - 918
[32] Linear convergence of the relaxed gradient projection algorithm for solving the split equality problems in Hilbert spaces
Tian, Tingting
Shi, Luoyi
Chen, Rudong
JOURNAL OF INEQUALITIES AND APPLICATIONS, 2019, 2019 (1)
[33] Linear convergence of the relaxed gradient projection algorithm for solving the split equality problems in Hilbert spaces
Tingting Tian
Luoyi Shi
Rudong Chen
Journal of Inequalities and Applications, 2019
[34] New analysis of linear convergence of gradient-type methods via unifying error bound conditions
Zhang, Hui
MATHEMATICAL PROGRAMMING, 2020, 180 (1-2) : 371 - 416
[35] New analysis of linear convergence of gradient-type methods via unifying error bound conditions
Hui Zhang
Mathematical Programming, 2020, 180 : 371 - 416
[36] Linear convergence of inexact descent method and inexact proximal gradient algorithms for lower-order regularization problems
Yaohua Hu
Chong Li
Kaiwen Meng
Xiaoqi Yang
Journal of Global Optimization, 2021, 79 : 853 - 883
[37] Linear convergence of inexact descent method and inexact proximal gradient algorithms for lower-order regularization problems
Hu, Yaohua
Li, Chong
Meng, Kaiwen
Yang, Xiaoqi
JOURNAL OF GLOBAL OPTIMIZATION, 2021, 79 (04) : 853 - 883
[38] Linear Convergence of Projection Algorithms
Dao, Minh N.
Phan, Hung M.
MATHEMATICS OF OPERATIONS RESEARCH, 2019, 44 (02) : 715 - 738
[39] Linear convergence of cyclic SAGA
Youngsuk Park
Ernest K. Ryu
Optimization Letters, 2020, 14 : 1583 - 1598
[40] Linear convergence of cyclic SAGA
Park, Youngsuk
Ryu, Ernest K.
OPTIMIZATION LETTERS, 2020, 14 (06) : 1583 - 1598

← 1 2 3 4 5 →