Reinforcement Learning with Experience Replay for Model-Free Humanoid Walking Optimization

Cited by: 8
Authors
Wawrzynski, Pawel [1 ]
Affiliation
[1] Warsaw Univ Technol, Inst Control & Computat Engn, PL-00665 Warsaw, Poland
Keywords
Reinforcement learning; learning in robots; humanoids; bipedal walking; convergence; robots
DOI
10.1142/S0219843614500248
CLC classification
TP24 [Robotics]
Discipline classification
080202; 1405
Abstract
In this paper, a control system for humanoid robot walking is approximately optimized by means of reinforcement learning. An 18-DOF humanoid is given whose gait is based on replaying a simple trajectory. This trajectory is translated into a reactive policy. A neural network whose input represents the robot state learns to produce output that additively modifies the initial control. The learning algorithm applied is actor-critic with experience replay. In 50 min of learning, the slow initial gait turns into dexterous, fast walking. No model of the robot dynamics is used. The methodology is generic and can be applied to optimize control systems for diverse robots of comparable complexity.
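Below is a minimal sketch, in NumPy, of the control scheme the abstract describes: a hand-designed gait trajectory is replayed, a learned policy adds a state-dependent correction to it, and transitions are stored in a replay buffer from which actor-critic updates are drawn. All dimensions, hyperparameters, the linear function approximators, the placeholder gait and dynamics, and the simplified update (which omits the importance weighting that actor-critic with experience replay uses) are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

N_DOF = 18                     # joint count from the abstract
STATE_DIM = 2 * N_DOF + 1      # assumed state: joint angles, velocities, gait phase
GAMMA, SIGMA = 0.99, 0.05      # discount factor, exploration noise std (assumed)
LR_ACTOR, LR_CRITIC = 1e-4, 1e-3


class ReplayBuffer:
    """Fixed-capacity store of (s, a, r, s') transitions for experience replay."""

    def __init__(self, capacity, state_dim, action_dim):
        self.s = np.zeros((capacity, state_dim))
        self.a = np.zeros((capacity, action_dim))
        self.r = np.zeros(capacity)
        self.s2 = np.zeros((capacity, state_dim))
        self.ptr = self.size = 0
        self.capacity = capacity

    def add(self, s, a, r, s2):
        i = self.ptr
        self.s[i], self.a[i], self.r[i], self.s2[i] = s, a, r, s2
        self.ptr = (self.ptr + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def sample(self, batch):
        idx = rng.integers(0, self.size, size=batch)
        return self.s[idx], self.a[idx], self.r[idx], self.s2[idx]


def nominal_control(phase):
    """Replay of the hand-designed gait trajectory (placeholder sinusoid)."""
    return 0.3 * np.sin(2 * np.pi * phase + np.linspace(0, np.pi, N_DOF))


W = 0.01 * rng.standard_normal((N_DOF, STATE_DIM))   # actor weights (the paper uses an MLP)
v = np.zeros(STATE_DIM)                               # linear critic weights
buffer = ReplayBuffer(50_000, STATE_DIM, N_DOF)

state = np.zeros(STATE_DIM)                           # would come from the robot's sensors
for step in range(1000):
    phase = (step % 100) / 100.0
    mean = np.tanh(W @ state)                         # policy mean: additive correction
    action = mean + SIGMA * rng.standard_normal(N_DOF)
    command = nominal_control(phase) + action         # actor only modifies the nominal gait

    # next_state, reward = robot.step(command)        # supplied by the real or simulated robot
    next_state, reward = state.copy(), 0.0            # placeholder keeps the sketch runnable

    buffer.add(state, action, reward, next_state)
    state = next_state

    if buffer.size >= 64:                             # replayed actor-critic update
        s, a, r, s2 = buffer.sample(64)
        td = r + GAMMA * (s2 @ v) - (s @ v)           # TD errors for the sampled batch
        v += LR_CRITIC * (td[:, None] * s).mean(axis=0)
        mu = np.tanh(s @ W.T)
        grad_logpi = ((a - mu) / SIGMA**2) * (1.0 - mu**2)   # d log pi / d pre-activation
        W += LR_ACTOR * np.einsum("b,bi,bj->ij", td, grad_logpi, s) / len(td)
```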
Pages: 21