Stochastic Variance-Reduced Policy Gradient

Cited by: 0
Authors
Papini, Matteo [1]
Binaghi, Damiano [1]
Canonaco, Giuseppe [1]
Pirotta, Matteo [2]
Restelli, Marcello [1]
Affiliations
[1] Politecn Milan, Milan, Italy
[2] INRIA, Lille, France
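Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes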
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a novel reinforcement-learning algorithm consisting of a stochastic variance-reduced version of policy gradient for solving Markov Decision Processes (MDPs). Stochastic variance-reduced gradient (SVRG) methods have proven to be very successful in supervised learning. However, their adaptation to policy gradient is not straightforward and needs to account for (i) a non-concave objective function; (ii) approximations in the full gradient computation; and (iii) a non-stationary sampling process. The result is SVRPG, a stochastic variance-reduced policy gradient algorithm that leverages importance weights to preserve the unbiasedness of the gradient estimate. Under standard assumptions on the MDP, we provide convergence guarantees for SVRPG with a convergence rate that is linear under increasing batch sizes. Finally, we suggest practical variants of SVRPG, and we empirically evaluate them on continuous MDPs.
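The role of the importance weights mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the one-step problem, the unit-variance Gaussian policy, the reward function, the hyperparameters, and the names `reward`, `grad_estimate`, `importance_weight`, and `svrpg_toy` are all assumptions made for illustration. The sketch pairs a large-batch snapshot gradient with a small-batch correction term; because the small batch is sampled from the current policy rather than the snapshot policy, the snapshot-gradient term is reweighted so that the correction has zero mean.

```python
import math
import random


def reward(a):
    # Hypothetical one-step reward, maximized at a = 2 (an assumption).
    return -(a - 2.0) ** 2


def grad_estimate(a, theta):
    # Single-sample score-function gradient for a Gaussian policy N(theta, 1):
    # d/dtheta log pi(a | theta) * r(a) = (a - theta) * r(a).
    return (a - theta) * reward(a)


def importance_weight(a, theta, theta_snap):
    # omega = p(a | theta_snap) / p(a | theta) for unit-variance Gaussians.
    return math.exp(-0.5 * (a - theta_snap) ** 2 + 0.5 * (a - theta) ** 2)


def svrpg_toy(epochs=40, inner_steps=5, n_snapshot=500, batch=50,
              step=0.02, seed=0):
    rng = random.Random(seed)
    theta = 0.0
    for _ in range(epochs):
        theta_snap = theta
        # Snapshot gradient: a large-batch estimate that stands in for the
        # full gradient, which cannot be computed exactly.
        mu = sum(grad_estimate(rng.gauss(theta_snap, 1.0), theta_snap)
                 for _ in range(n_snapshot)) / n_snapshot
        for _ in range(inner_steps):
            correction = 0.0
            for _ in range(batch):
                a = rng.gauss(theta, 1.0)  # sampled from the current policy
                w = importance_weight(a, theta, theta_snap)
                # Reweighting the snapshot term keeps its expectation equal
                # to the snapshot gradient under the current sampling policy.
                correction += (grad_estimate(a, theta)
                               - w * grad_estimate(a, theta_snap))
            theta += step * (mu + correction / batch)  # gradient ascent
    return theta
```

Calling `svrpg_toy()` returns the final policy mean, which under these assumptions should end up close to the reward maximizer at 2. Shortening `inner_steps` keeps the current policy near the snapshot policy, which keeps the importance weights close to 1 and their variance small.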
Pages: 10
Related papers
50 in total
  • [41] PrivSGP-VR: Differentially Private Variance-Reduced Stochastic Gradient Push with Tight Utility Bounds
    Zhu, Zehan
    Huang, Yan
    Wang, Xin
    Xu, Jinming
    PROCEEDINGS OF THE THIRTY-THIRD INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2024, 2024, : 5743 - 5752
  • [42] Variance-Reduced Stochastic Optimization for Efficient Inference of Hidden Markov Models
    Sidrow, Evan
    Heckman, Nancy
    Bouchard-Cote, Alexandre
    Fortune, Sarah M. E.
    Trites, Andrew W.
    Auger-Methe, Marie
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2025, 34 (01) : 222 - 238
  • [43] Variance-Reduced Stochastic Quasi-Newton Methods for Decentralized Learning
    Zhang, Jiaojiao
    Liu, Huikang
    So, Anthony Man-Cho
    Ling, Qing
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2023, 71 : 311 - 326
  • [44] A DECENTRALIZED VARIANCE-REDUCED METHOD FOR STOCHASTIC OPTIMIZATION OVER DIRECTED GRAPHS
    Qureshi, Muhammad I.
    Xin, Ran
    Kar, Soummya
    Khan, Usman A.
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 5030 - 5034
  • [45] Sampling from Non-Log-Concave Distributions via Stochastic Variance-Reduced Gradient Langevin Dynamics
    Zou, Difan
    Xu, Pan
    Gu, Quanquan
    22ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 89, 2019, 89
  • [46] Variance-Reduced Stochastic Learning by Networked Agents Under Random Reshuffling
    Yuan, Kun
    Ying, Bicheng
    Liu, Jiageng
    Sayed, Ali H.
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2019, 67 (02) : 351 - 366
  • [47] Variance-Reduced Shuffling Gradient Descent With Momentum for Finite-Sum Minimization
    Jiang, Xia
    Zeng, Xianlin
    Xie, Lihua
    Sun, Jian
    IEEE CONTROL SYSTEMS LETTERS, 2023, 7 : 1700 - 1705
  • [48] Variance-reduced reshuffling gradient descent for nonconvex optimization: Centralized and distributed algorithms
    Jiang, Xia
    Zeng, Xianlin
    Xie, Lihua
    Sun, Jian
    Chen, Jie
    AUTOMATICA, 2025, 171
  • [49] Stochastic variance-reduced prox-linear algorithms for nonconvex composite optimization
    Zhang, Junyu
    Xiao, Lin
    Mathematical Programming, 2022, 195 : 649 - 691
  • [50] A Hybrid Variance-Reduced Method for Decentralized Stochastic Non-Convex Optimization
    Xin, Ran
    Khan, Usman A.
    Kar, Soummya
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139