On the linear convergence of policy gradient under Hadamard parameterization

被引:0
作者
Liu, Jiacai [1 ]
Chen, Jinchi [2 ]
Wei, Ke [1 ]
机构
[1] Fudan Univ, Sch Data Sci, Shanghai 200433, Peoples R China
[2] East China Univ Sci & Technol, Sch Math, Shanghai 200433, Peoples R China
关键词
policy gradient; Hadamard parameterization; linear convergence; sub-optimal probability;
D O I
10.1093/imaiai/iaaf003
中图分类号
O29 [应用数学];
学科分类号
070104 ;
摘要
The convergence of deterministic policy gradient under the Hadamard parameterization is studied in the tabular setting and the linear convergence of the algorithm is established. To this end, we first show that the error decreases at an $O(\frac{1}{k})$ rate for all the iterations. Based on this result, we further show that the algorithm has a faster local linear convergence rate after $k_{0}$ iterations, where $k_{0}$ is a constant that only depends on the MDP problem and the initialization. To show the local linear convergence of the algorithm, we have indeed established the contraction of the sub-optimal probability $b_{s}<^>{k}$ (i.e. the probability of the output policy $\pi <^>{k}$ on non-optimal actions) when $k\ge k_{0}$.
引用
收藏
页数:38
相关论文
共 50 条
  • [41] Policy Gradient MaxSAT Solver
    Gutierrez-De-La-Paz, Omar
    Menchaca-Mendez, Ricardo
    Zamora-Gomez, Erik
    Corona-Bermudez, Uriel
    Menchaca-Mendez, Rolando
    Gutierrez-De-La-Paz, Bruno
    COMPUTACION Y SISTEMAS, 2024, 28 (02): : 353 - 366
  • [42] On Faster Convergence of Scaled Sign Gradient Descent
    Li, Xiuxian
    Lin, Kuo-Yi
    Li, Li
    Hong, Yiguang
    Chen, Jie
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2024, 20 (02) : 1732 - 1741
  • [43] On the linear convergence of circumcentered isometry methods
    Bauschke, Heinz H.
    Ouyang, Hui
    Wang, Xianfu
    NUMERICAL ALGORITHMS, 2021, 87 (01) : 263 - 297
  • [44] Transforming logarithmic to linear convergence by interpolation
    Homeier, HHH
    APPLIED MATHEMATICS LETTERS, 1999, 12 (02) : 13 - 17
  • [45] Restart FISTA with Global Linear Convergence
    Alamo, Teodoro
    Krupa, Pablo
    Limon, Daniel
    2019 18TH EUROPEAN CONTROL CONFERENCE (ECC), 2019, : 1969 - 1974
  • [46] On the linear convergence of circumcentered isometry methods
    Heinz H. Bauschke
    Hui Ouyang
    Xianfu Wang
    Numerical Algorithms, 2021, 87 : 263 - 297
  • [47] On the Linear Convergence of Two Decentralized Algorithms
    Yao Li
    Ming Yan
    Journal of Optimization Theory and Applications, 2021, 189 : 271 - 290
  • [48] On the Linear Convergence of Two Decentralized Algorithms
    Li, Yao
    Yan, Ming
    JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 2021, 189 (01) : 271 - 290
  • [49] ON THE LINEAR CONVERGENCE OF THE MULTIMARGINAL SINKHORN ALGORITHM
    Carlier, Guillaume
    SIAM JOURNAL ON OPTIMIZATION, 2022, 32 (02) : 786 - 794
  • [50] ON THE LINEAR CONVERGENCE OF A BREGMAN PROXIMAL POINT
    Guo, Ke
    Zhu, Chunrong
    JOURNAL OF NONLINEAR AND VARIATIONAL ANALYSIS, 2022, 6 (02): : 5 - 14