On the linear convergence of policy gradient under Hadamard parameterization

Times cited: 0
Authors
Liu, Jiacai [1]
Chen, Jinchi [2]
Wei, Ke [1]
Affiliations
[1] Fudan Univ, Sch Data Sci, Shanghai 200433, Peoples R China
[2] East China Univ Sci & Technol, Sch Math, Shanghai 200433, Peoples R China
Keywords
policy gradient; Hadamard parameterization; linear convergence; sub-optimal probability
DOI
10.1093/imaiai/iaaf003
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
The convergence of deterministic policy gradient under the Hadamard parameterization is studied in the tabular setting, and the linear convergence of the algorithm is established. To this end, we first show that the error decreases at an $O(\frac{1}{k})$ rate for all iterations. Based on this result, we further show that the algorithm has a faster local linear convergence rate after $k_{0}$ iterations, where $k_{0}$ is a constant that depends only on the MDP problem and the initialization. To show the local linear convergence of the algorithm, we establish the contraction of the sub-optimal probability $b_{s}^{k}$ (i.e., the probability that the output policy $\pi^{k}$ assigns to non-optimal actions) when $k\ge k_{0}$.
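The two convergence regimes named in the abstract can be written schematically as follows. This is only an illustrative sketch: the error symbol $e_k$, the constants $C$ and $\rho$, and the optimal-action set $\mathcal{A}_s^*$ are notation assumed here, not taken from the record, and the abstract does not state the exact form of the bounds or of the contraction.

```latex
% Notation assumed for illustration only:
%   e_k              error at iteration k
%   C > 0, \rho \in (0,1)   unspecified constants
%   \mathcal{A}_s^*  set of optimal actions at state s
\begin{align*}
  e_k   &\le \frac{C}{k},                 && k \ge 1   && \text{(global sublinear rate)} \\
  e_k   &\le \rho^{\,k-k_0}\, e_{k_0},    && k \ge k_0 && \text{(local linear rate)} \\
  b_s^k &= \sum_{a \notin \mathcal{A}_s^*} \pi^k(a \mid s) && && \text{(sub-optimal probability at state } s\text{)}
\end{align*}
```

Read this way, the local result says that once $k \ge k_0$ the mass $b_s^k$ placed on non-optimal actions shrinks geometrically, which is what upgrades the $O(\frac{1}{k})$ bound to a linear rate.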
Pages: 38