Understanding the stability of deep control policies for biped locomotion

Cited by: 3
Authors
Park, Hwangpil [1, 3]
Yu, Ri [1]
Lee, Yoonsang [4]
Lee, Kyungho [5]
Lee, Jehee [2]
Affiliations
[1] Seoul Natl Univ, Seoul, South Korea
[2] Seoul Natl Univ, Dept Comp Sci & Engn, Seoul, South Korea
[3] Samsung Elect, Suwon, South Korea
[4] Hanyang Univ, Comp Sci, Seoul, South Korea
[5] NC Soft, Sungnam, South Korea
Source
VISUAL COMPUTER | 2023, Vol. 39, No. 1
Keywords
Biped locomotion; Deep reinforcement learning; Gait analysis; Physically based simulation; Push-recovery stability; RECOVERY;
DOI
10.1007/s00371-021-02342-9
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Achieving stability and robustness is the primary goal of biped locomotion control. Recently, deep reinforcement learning (DRL) has attracted great attention as a general methodology for constructing biped control policies and demonstrated significant improvements over the previous state-of-the-art control methods. Although deep control policies are more advantageous compared with previous controller design approaches, many questions remain: Are deep control policies as robust as human walking? Does simulated walking involve strategies similar to human walking for maintaining balance? Does a particular gait pattern affect human and simulated walking similarly? What do deep policies learn to achieve improved gait stability? The goal of this study is to address these questions by evaluating the push-recovery stability of deep policies compared with those of human subjects and a previous feedback controller. Furthermore, we conducted experiments to evaluate the effectiveness of variants of DRL algorithms.
Pages: 473-487
Page count: 15
Related papers
50 in total
  • [31] Planning biped locomotion using motion capture data and probabilistic roadmaps
    Choi, MG
    Lee, J
    Shin, SY
    ACM TRANSACTIONS ON GRAPHICS, 2003, 22 (02): : 182 - 203
  • [32] Learning CPG-based biped locomotion with a policy gradient method
    Matsubara, Takamitsu
    Morimoto, Jun
    Nakanishi, Jun
    Sato, Masa-aki
    Doya, Kenji
    ROBOTICS AND AUTONOMOUS SYSTEMS, 2006, 54 (11) : 911 - 920
  • [33] Flexible Structure and Wheeled Feet to Simplify Biped Locomotion of Humanoid Robots
    Muscolo, Giovanni Gerardo
    Recchiuto, Carmine Tommaso
    INTERNATIONAL JOURNAL OF HUMANOID ROBOTICS, 2017, 14 (01)
  • [34] Adapting Biped Locomotion to Sloped Environments: Combining Reinforcement Learning with Dynamical Systems
    João André
    Carlos Teixeira
    Cristina P. Santos
    Lino Costa
    Journal of Intelligent & Robotic Systems, 2015, 80 : 625 - 640
  • [35] CONTROLLING HUMAN-LIKE LOCOMOTION OF A BIPED BY A BIOLOGICALLY MOTIVATED APPROACH
    Zhao, J.
    Luksch, T.
    Berns, K.
    FIELD ROBOTICS, 2012, : 149 - 156
  • [36] Learning CPG-based biped locomotion with a policy gradient method
    Matsubara, T
    Morimoto, J
    Nakanishi, J
    Sato, M
    Doya, K
    2005 5TH IEEE-RAS INTERNATIONAL CONFERENCE ON HUMANOID ROBOTS, 2005, : 208 - 213
  • [37] Deep reinforcement learning for modeling human locomotion control in neuromechanical simulation
    Seungmoon Song
    Łukasz Kidziński
    Xue Bin Peng
    Carmichael Ong
    Jennifer Hicks
    Sergey Levine
    Christopher G. Atkeson
    Scott L. Delp
    Journal of NeuroEngineering and Rehabilitation, 18
  • [38] Deep reinforcement learning for modeling human locomotion control in neuromechanical simulation
    Song, Seungmoon
    Kidzinski, Lukasz
    Bin Peng, Xue
    Ong, Carmichael
    Hicks, Jennifer
    Levine, Sergey
    Atkeson, Christopher G.
    Delp, Scott L.
    JOURNAL OF NEUROENGINEERING AND REHABILITATION, 2021, 18 (01)
  • [39] Adaptive Robot Biped Locomotion with Dynamic Motion Primitives and Coupled Phase Oscillators
    José Rosado
    Filipe Silva
    Vítor Santos
    António Amaro
    Journal of Intelligent & Robotic Systems, 2016, 83 : 375 - 391
  • [40] Adaptive Robot Biped Locomotion with Dynamic Motion Primitives and Coupled Phase Oscillators
    Rosado, Jose
    Silva, Filipe
    Santos, Vitor
    Amaro, Antonio
    JOURNAL OF INTELLIGENT & ROBOTIC SYSTEMS, 2016, 83 (3-4) : 375 - 391