Reinforcement Learning with Experience Replay for Model-Free Humanoid Walking Optimization

Cited by: 8
Author
Wawrzynski, Pawel [1]
Affiliation
[1] Warsaw Univ Technol, Inst Control & Computat Engn, PL-00665 Warsaw, Poland
Keywords
Reinforcement learning; learning in robots; humanoids; bipedal walking; CONVERGENCE; ROBOTS;
DOI
10.1142/S0219843614500248
Chinese Library Classification number
TP24 [Robot technology]
Discipline classification code
080202; 1405
Abstract
In this paper, a control system for humanoid robot walking is approximately optimized by means of reinforcement learning. The robot is an 18-DOF humanoid whose gait is based on replaying a simple trajectory. This trajectory is translated into a reactive policy. A neural network whose input represents the robot state learns to produce an output that additively modifies the initial control. The learning algorithm applied is actor-critic with experience replay. In 50 min of learning, the slow initial gait changes into a dexterous and fast walk. No model of the robot dynamics is used. The methodology is generic and can be applied to optimize control systems for diverse robots of comparable complexity.
Pages: 21
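
The abstract describes a baseline gait turned into a reactive policy, a neural network whose output is added to that baseline control, and learning from replayed experience. The sketch below illustrates only that structure and is not the paper's implementation: the sinusoidal baseline, the network size, the zero-initialized output layer, and all names (nominal_control, CorrectionNet, control, ReplayBuffer) are assumptions made for illustration, and the actor-critic update itself is omitted.

```python
import math
import random
from collections import deque


def nominal_control(phase, n_joints=18):
    """Hypothetical baseline: joint targets as a function of gait phase."""
    return [0.1 * math.sin(2 * math.pi * phase + j) for j in range(n_joints)]


class CorrectionNet:
    """Tiny one-hidden-layer network whose output is an additive correction."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.gauss(0, 0.1) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        # Assumption: zero output weights, so the untrained policy
        # reproduces the baseline gait exactly.
        self.w2 = [[0.0] * n_hidden for _ in range(n_out)]

    def forward(self, state):
        hidden = [math.tanh(sum(w * s for w, s in zip(row, state)))
                  for row in self.w1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in self.w2]


def control(net, state, phase):
    """Baseline control plus the learned additive correction."""
    base = nominal_control(phase, n_joints=len(net.w2))
    correction = net.forward(state)
    return [b + c for b, c in zip(base, correction)]


class ReplayBuffer:
    """Fixed-capacity store of past transitions for repeated reuse."""

    def __init__(self, capacity):
        self.data = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state):
        self.data.append((state, action, reward, next_state))

    def sample(self, k):
        # Draw up to k stored transitions without replacement.
        return random.sample(list(self.data), min(k, len(self.data)))
```

In a learning loop under these assumptions, each control step would call control(), store the resulting transition with add(), and then draw minibatches with sample() to update the network several times per real step, which is how replay can reduce the amount of robot time needed.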