On the linear convergence of policy gradient under Hadamard parameterization

Cited by: 0

Authors:

Liu, Jiacai [1]

Chen, Jinchi [2]

Wei, Ke [1]

Affiliations:

[1] Fudan Univ, Sch Data Sci, Shanghai 200433, Peoples R China

[2] East China Univ Sci & Technol, Sch Math, Shanghai 200433, Peoples R China

Source:

INFORMATION AND INFERENCE-A JOURNAL OF THE IMA | 2025 / Vol. 14 / Issue 01

Keywords:

policy gradient; Hadamard parameterization; linear convergence; sub-optimal probability;

DOI:

10.1093/imaiai/iaaf003

Chinese Library Classification (CLC) number:

O29 [Applied Mathematics];

Discipline classification code:

070104;

Abstract:

The convergence of deterministic policy gradient under the Hadamard parameterization is studied in the tabular setting and the linear convergence of the algorithm is established. To this end, we first show that the error decreases at an $O(\frac{1}{k})$ rate for all the iterations. Based on this result, we further show that the algorithm has a faster local linear convergence rate after $k_{0}$ iterations, where $k_{0}$ is a constant that only depends on the MDP problem and the initialization. To show the local linear convergence of the algorithm, we have indeed established the contraction of the sub-optimal probability $b_{s}^{k}$ (i.e. the probability of the output policy $\pi^{k}$ on non-optimal actions) when $k\ge k_{0}$.
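
The sub-optimal probability named in the abstract can be illustrated with a minimal sketch. Several things here are assumptions rather than details taken from the record: the Hadamard parameterization is taken to mean $\pi = \theta \odot \theta$ (with a row renormalization added for safety), optimal actions are taken to be the maximizers of an optimal action-value table `q_star`, and the function names `hadamard_policy` and `suboptimal_probability` are hypothetical. It does not reproduce the paper's update rule.

```python
import numpy as np

def hadamard_policy(theta):
    """Map parameters to a tabular policy via pi = theta * theta (assumed form).

    Each row (state) is renormalized so its action probabilities sum to 1.
    """
    squared = theta * theta
    return squared / squared.sum(axis=1, keepdims=True)

def suboptimal_probability(pi, q_star, tol=1e-12):
    """Return b_s: the probability pi puts on non-optimal actions in each state.

    An action is treated as non-optimal when its q_star value is strictly
    below the best value in that state (up to tol). This is an assumption
    about how "non-optimal" is defined.
    """
    best = q_star.max(axis=1, keepdims=True)
    non_optimal = q_star < best - tol
    return (pi * non_optimal).sum(axis=1)

# Toy example: 2 states, 2 actions (values are illustrative only).
theta = np.array([[1.0, 1.0], [2.0, 0.0]])
q_star = np.array([[1.0, 0.0], [0.5, 0.5]])

pi = hadamard_policy(theta)
b = suboptimal_probability(pi, q_star)
print(pi)
print(b)
```

In the second state both actions are optimal, so no probability mass counts as sub-optimal there. Tracking `b` across iterations $k$ is one way to observe the contraction the abstract describes for $k \ge k_{0}$.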

Pages: 38
