Evaluation of policy gradient methods and variants on the cart-pole benchmark

Cited by: 29
Authors
Riedmiller, Martin [1]
Peters, Jan [2]
Schaal, Stefan [2]
Affiliations
[1] Univ Osnabruck, Neuroinformat Grp, D-4500 Osnabruck, Germany
[2] Univ Southern Calif, Computat Learning & Motor Control, Los Angeles, CA 90007 USA
Source
2007 IEEE INTERNATIONAL SYMPOSIUM ON APPROXIMATE DYNAMIC PROGRAMMING AND REINFORCEMENT LEARNING | 2007
DOI
10.1109/ADPRL.2007.368196
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we evaluate different versions from the three main kinds of model-free policy gradient methods, i.e., finite difference gradients, 'vanilla' policy gradients and natural policy gradients. Each of these methods is first presented in its simple form and subsequently refined and optimized. By carrying out numerous experiments on the cart pole regulator benchmark we aim to provide a useful baseline for future research on parameterized policy search algorithms. Portable C++ code is provided for both plant and algorithms; thus, the results in this paper can be reevaluated, reused and new algorithms can be inserted with ease.
Pages: 254 / +
Page count: 2
Related papers
50 in total
  • [1] Fuzzy Supervisory Control of a Cart-Pole System
    Li, Jen-Hsing
    11TH IEEE INTERNATIONAL CONFERENCE ON CONTROL AND AUTOMATION (ICCA), 2014, : 435 - 439
  • [2] A speed regulator for a force-driven cart-pole system
    Sandoval, Jesus
    Kelly, Rafael
    Santibanez, Victor
    INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 2022, 53 (02) : 412 - 430
  • [3] A Surrogate Model-based Genetic Algorithm for the Optimal Policy in Cart-pole Balancing Environments
    Shin, Seung-Soo
    Kim, Yong-Hyuk
    PROCEEDINGS OF THE 2022 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE COMPANION, GECCO 2022, 2022, : 503 - 505
  • [4] Comparison of Reinforcement Learning Algorithms applied to the Cart-Pole Problem
    Nagendra, Savinay
    Podila, Nikhil
    Ugarakhod, Rashmi
    George, Koshy
    2017 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2017, : 26 - 32
  • [5] Stabilizing a Vehicle near Rollover: An Analogy to Cart-Pole Stabilization
    Peters, Steven C.
    Bobrow, James E.
    Iagnemma, Karl
    2010 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2010, : 5194 - 5200
  • [6] Adaptive dynamic surface control of cart-pole inverted pendulum
    Huang H.-X.
    Ding C.
    Liu J.-T.
    Kongzhi Lilun Yu Yingyong/Control Theory and Applications, 2019, 36 (06) : 1002 - 1008
  • [7] Optimization and control of a pendulum-driven cart-pole system
    Liu, Yang
    Yu, Hongnian
    Burrows, Brian
    2007 IEEE INTERNATIONAL CONFERENCE ON NETWORKING, SENSING, AND CONTROL, VOLS 1 AND 2, 2007, : 151 - +
  • [8] Optimal Solution of Kinodynamic Motion Planning for the Cart-Pole System
    Boriero, Fabrizio
    Sansonetto, Nicola
    Marigonda, Antonio
    Muradore, Riccardo
    Fiorini, Paolo
    IFAC PAPERSONLINE, 2017, 50 (01) : 6308 - 6313
  • [9] Opponent cart-pole dynamics for reinforcement learning of competing agents
    Huang, Xun
    ACTA MECHANICA SINICA, 2022, 38 (05)
  • [10] A Distributed Iterative LQR Approach for a Cart-Pole Network Synchronization
    Rodriguez-Gil, J. A.
    Arevalo-Castiblanco, M. F.
    Tellez-Castro, D.
    Mojica-Nava, E.
    2021 IEEE 5TH COLOMBIAN CONFERENCE ON AUTOMATIC CONTROL (CCAC): TECHNOLOGICAL ADVANCES FOR A SUSTAINABLE REGIONAL DEVELOPMENT, 2021, : 151 - 156