Reinforcement Learning with Experience Replay for Model-Free Humanoid Walking Optimization

Times cited: 8
Author
Wawrzynski, Pawel [1]
Affiliation
[1] Warsaw Univ Technol, Inst Control & Computat Engn, PL-00665 Warsaw, Poland
Keywords
Reinforcement learning; learning in robots; humanoids; bipedal walking; CONVERGENCE; ROBOTS
DOI
10.1142/S0219843614500248
Chinese Library Classification number
TP24 [Robot technology]
Discipline classification codes
080202; 1405
Abstract
In this paper, a control system for humanoid robot walking is approximately optimized by means of reinforcement learning. The starting point is an 18-DOF humanoid whose gait is based on replaying a simple trajectory. This trajectory is translated into a reactive policy. A neural network whose input represents the robot state learns to produce an output that additively modifies the initial control. The learning algorithm applied is actor-critic with experience replay. In 50 min of learning, the slow initial gait changes into a dexterous and fast walk. No model of the robot dynamics is used. The methodology is generic and can be applied to optimize control systems for diverse robots of comparable complexity.
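The abstract describes two structural ideas: a learned correction that is added to an initial control signal, and a store of past transitions that can be replayed for learning. The following is only a minimal sketch of those two pieces, not the paper's implementation. All names (`base_control`, `correction`, `act`, `ReplayBuffer`) are hypothetical, the linear map stands in for the neural network, the proportional rule stands in for the initial reactive policy, and the actor-critic update itself is deliberately left out.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) tuples."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest transition when full.
        self.data = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state):
        self.data.append((state, action, reward, next_state))

    def sample(self, batch_size, rng):
        # Draw without replacement; never ask for more than is stored.
        return rng.sample(list(self.data), min(batch_size, len(self.data)))


def base_control(state):
    # Assumption: a toy proportional rule standing in for the initial
    # reactive policy derived from the replayed trajectory.
    return [-0.5 * s for s in state]


def correction(weights, state):
    # Assumption: a linear map standing in for the neural network output.
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


def act(weights, state, rng, noise_std=0.1):
    # Final control = initial control + learned correction + exploration noise.
    base = base_control(state)
    delta = correction(weights, state)
    return [b + d + rng.gauss(0.0, noise_std) for b, d in zip(base, delta)]
```

With all-zero weights and no exploration noise, `act` returns exactly the initial control, which is why learning in such a scheme can start from the existing gait rather than from scratch.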
Pages: 21