Reinforcement Learning with Experience Replay for Model-Free Humanoid Walking Optimization

Cited by: 8
Authors
Wawrzynski, Pawel [1]
Affiliation
[1] Warsaw Univ Technol, Inst Control & Computat Engn, PL-00665 Warsaw, Poland
Keywords
Reinforcement learning; learning in robots; humanoids; bipedal walking; convergence; robots
DOI
10.1142/S0219843614500248
Chinese Library Classification
TP24 [Robotics]
Discipline classification code
080202; 1405
Abstract
In this paper, a control system for humanoid robot walking is approximately optimized by means of reinforcement learning. The subject is an 18-DOF humanoid whose initial gait is based on replaying a simple trajectory. This trajectory is translated into a reactive policy. A neural network whose input represents the robot state learns to produce output that additively modifies the initial control. The learning algorithm applied is actor-critic with experience replay. In 50 min of learning, the slow initial gait changes into dexterous, fast walking. No model of the robot dynamics is used. The methodology is generic and can be applied to optimize control systems for diverse robots of comparable complexity.
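The two structural ideas in the abstract, a learned correction added to a fixed initial control and a buffer of past transitions that the learner can replay, can be illustrated with a minimal sketch. This is not the paper's implementation: `base_control`, `ResidualPolicy`, `ReplayBuffer`, the sinusoidal gait, the linear "network", and all numeric constants are hypothetical stand-ins chosen for illustration, and the actor-critic update itself is left out.

```python
import math
import random
from collections import deque


def base_control(phase, n_joints=18):
    """Hypothetical stand-in for the initial trajectory-replay gait."""
    return [0.3 * math.sin(phase + 0.5 * j) for j in range(n_joints)]


class ResidualPolicy:
    """Tiny linear 'network' (assumption) whose output is added to the base control."""

    def __init__(self, state_dim, n_joints=18, scale=0.1):
        # Zero weights mean the policy starts out identical to the initial gait.
        self.weights = [[0.0] * state_dim for _ in range(n_joints)]
        self.scale = scale

    def correction(self, state):
        # One bounded correction per joint.
        return [
            self.scale * math.tanh(sum(w * s for w, s in zip(row, state)))
            for row in self.weights
        ]

    def act(self, state, phase, noise_std=0.0, rng=random):
        base = base_control(phase, len(self.weights))
        corr = self.correction(state)
        action = []
        for b, c in zip(base, corr):
            # Optional Gaussian exploration noise on top of base + correction.
            noise = rng.gauss(0.0, noise_std) if noise_std > 0 else 0.0
            action.append(b + c + noise)
        return action


class ReplayBuffer:
    """Fixed-capacity store of past transitions for repeated reuse."""

    def __init__(self, capacity):
        self.data = deque(maxlen=capacity)  # oldest entries are dropped first

    def add(self, state, action, reward, next_state):
        self.data.append((state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        k = min(batch_size, len(self.data))
        return rng.sample(list(self.data), k)
```

Starting the correction at zero means learning begins from the existing gait rather than from scratch, and replaying stored transitions lets each minute of robot time contribute to many updates, which is one plausible reading of how a 50 min learning budget becomes feasible.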
Pages: 21