Evaluation of policy gradient methods and variants on the cart-pole benchmark

Cited by: 29

Authors:

Riedmiller, Martin [1]

Peters, Jan [2]

Schaal, Stefan [2]

Affiliations:

[1] Univ Osnabruck, Neuroinformat Grp, D-4500 Osnabruck, Germany

[2] Univ Southern Calif, Computat Learning & Motor Control, Los Angeles, CA 90007 USA

Source:

2007 IEEE INTERNATIONAL SYMPOSIUM ON APPROXIMATE DYNAMIC PROGRAMMING AND REINFORCEMENT LEARNING | 2007

Keywords:

DOI:

10.1109/ADPRL.2007.368196

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we evaluate different versions of the three main kinds of model-free policy gradient methods: finite difference gradients, 'vanilla' policy gradients, and natural policy gradients. Each of these methods is first presented in its simple form and subsequently refined and optimized. By carrying out numerous experiments on the cart-pole regulator benchmark, we aim to provide a useful baseline for future research on parameterized policy search algorithms. Portable C++ code is provided for both plant and algorithms; thus, the results in this paper can be reevaluated and reused, and new algorithms can be inserted with ease.
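
As a rough illustration of the first method family named in the abstract, the sketch below estimates a gradient by finite differences and uses it for gradient ascent. It is not the paper's C++ code: it is written in Python, it uses coordinate-wise central differences (one of several possible finite-difference schemes), and `toy_return` is a hypothetical deterministic stand-in for the return of a cart-pole rollout. All function names and parameter values here are assumptions made for illustration.

```python
from typing import Callable, List


def finite_difference_gradient(
    expected_return: Callable[[List[float]], float],
    theta: List[float],
    epsilon: float = 1e-4,
) -> List[float]:
    """Estimate dJ/dtheta by central differences, one parameter at a time."""
    gradient = []
    for i in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[i] += epsilon
        minus[i] -= epsilon
        # Two evaluations of the return per policy parameter.
        diff = expected_return(plus) - expected_return(minus)
        gradient.append(diff / (2.0 * epsilon))
    return gradient


def toy_return(theta: List[float]) -> float:
    """Hypothetical stand-in for a rollout return; maximized at (1, -2)."""
    return -((theta[0] - 1.0) ** 2) - (theta[1] + 2.0) ** 2


def gradient_ascent(
    theta: List[float], steps: int = 200, learning_rate: float = 0.1
) -> List[float]:
    """Follow the estimated gradient uphill for a fixed number of steps."""
    for _ in range(steps):
        grad = finite_difference_gradient(toy_return, theta)
        theta = [t + learning_rate * g for t, g in zip(theta, grad)]
    return theta


if __name__ == "__main__":
    print(finite_difference_gradient(toy_return, [0.0, 0.0]))
    print(gradient_ascent([0.0, 0.0]))
```

With a real, noisy plant, each evaluation of the return would typically be averaged over several rollouts, which this toy objective does not need.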

Citation

Pages: 254 / +

Page count: 2

50 items in total

[1] Fuzzy Supervisory Control of a Cart-Pole System
Li, Jen-Hsing
11TH IEEE INTERNATIONAL CONFERENCE ON CONTROL AND AUTOMATION (ICCA), 2014, : 435 - 439
[2] A speed regulator for a force-driven cart-pole system
Sandoval, Jesus
Kelly, Rafael
Santibanez, Victor
INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 2022, 53 (02) : 412 - 430
[3] A Surrogate Model-based Genetic Algorithm for the Optimal Policy in Cart-pole Balancing Environments
Shin, Seung-Soo
Kim, Yong-Hyuk
PROCEEDINGS OF THE 2022 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE COMPANION, GECCO 2022, 2022, : 503 - 505
[4] Comparison of Reinforcement Learning Algorithms applied to the Cart-Pole Problem
Nagendra, Savinay
Podila, Nikhil
Ugarakhod, Rashmi
George, Koshy
2017 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2017, : 26 - 32
[5] Stabilizing a Vehicle near Rollover: An Analogy to Cart-Pole Stabilization
Peters, Steven C.
Bobrow, James E.
Iagnemma, Karl
2010 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2010, : 5194 - 5200
[6] Adaptive dynamic surface control of cart-pole inverted pendulum
Huang H.-X.
Ding C.
Liu J.-T.
Kongzhi Lilun Yu Yingyong/Control Theory and Applications, 2019, 36 (06) : 1002 - 1008
[7] Optimization and control of a pendulum-driven cart-pole system
Liu, Yang
Yu, Hongnian
Burrows, Brian
2007 IEEE INTERNATIONAL CONFERENCE ON NETWORKING, SENSING, AND CONTROL, VOLS 1 AND 2, 2007, : 151 - +
[8] Optimal Solution of Kinodynamic Motion Planning for the Cart-Pole System
Boriero, Fabrizio
Sansonetto, Nicola
Marigonda, Antonio
Muradore, Riccardo
Fiorini, Paolo
IFAC PAPERSONLINE, 2017, 50 (01) : 6308 - 6313
[9] Opponent cart-pole dynamics for reinforcement learning of competing agents
Huang, Xun
ACTA MECHANICA SINICA, 2022, 38 (05)
[10] A Distributed Iterative LQR Approach for a Cart-Pole Network Synchronization
Rodriguez-Gil, J. A.
Arevalo-Castiblanco, M. F.
Tellez-Castro, D.
Mojica-Nava, E.
2021 IEEE 5TH COLOMBIAN CONFERENCE ON AUTOMATIC CONTROL (CCAC): TECHNOLOGICAL ADVANCES FOR A SUSTAINABLE REGIONAL DEVELOPMENT, 2021, : 151 - 156
