AM-RPPO Based Control Method for Biped Adaptive Locomotion

Authors
Ma L. [1]
Liu C. [1]
Lin L. [1]
Xu B. [1]
Chen Q. [1]
Affiliation
[1] School of Electronics and Information Engineering, Tongji University, Shanghai
Source
Jiqiren/Robot | 2019, Vol. 41, No. 06
Keywords
Adaptive biped locomotion; Attention mechanism; Deep reinforcement learning; Recurrent neural network
DOI
10.13973/j.cnki.robot.180785
Abstract
An AM-RPPO (attention mechanism-recurrent proximal policy optimization) based deep reinforcement learning (DRL) method is proposed and applied to the adaptive locomotion control of biped robots. First, the joint-space walking control problem for biped robots in unknown environments is modeled as a partially observable Markov decision process (POMDP), and the bias in estimating the real state with DRL algorithms and proximal policy optimization (PPO) is illustrated. Second, the architecture of the recurrent neural network (RNN) is introduced, and the forward propagation of observation states through the RNN in a time-sequential environment, which differs from that of multi-layer perceptrons, is analyzed. The RNN is embedded in both the action generation network and the value function generation network, and its advantages over traditional neural networks are demonstrated. Third, the attention mechanism (AM), widely used in many fields of deep learning, is introduced to obtain the states at different time steps and to establish a weighted differentiation model of the final value function. Finally, the effectiveness of the proposed AM-RPPO algorithm for the locomotion control of biped robots with high-dimensional states is verified through simulation experiments. © 2019, Science Press. All rights reserved.
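The abstract describes a value function built from RNN states at different time steps, weighted by an attention mechanism. The following is a minimal sketch of that general idea only, not the authors' network: it assumes a plain tanh (Elman) RNN and dot-product attention with a softmax, and every name, dimension, and weight here (`rnn_forward`, `attention_value`, `w_score`, `w_value`, `OBS_DIM`, `HID`) is hypothetical, with random weights standing in for trained ones.

```python
import math
import random


def rnn_forward(obs_seq, w_in, w_rec, bias):
    """Run a plain tanh RNN over the observations; return all hidden states."""
    hidden = [0.0] * len(bias)
    states = []
    for obs in obs_seq:
        # New hidden state depends on the current observation and the previous state.
        hidden = [
            math.tanh(
                sum(w * x for w, x in zip(w_in[i], obs))
                + sum(w * h for w, h in zip(w_rec[i], hidden))
                + bias[i]
            )
            for i in range(len(bias))
        ]
        states.append(hidden)
    return states


def attention_value(states, w_score, w_value):
    """Weight the hidden states by softmax attention, then map to a scalar value."""
    scores = [sum(w * h for w, h in zip(w_score, s)) for s in states]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted sum of hidden states over time.
    context = [
        sum(a * s[j] for a, s in zip(weights, states))
        for j in range(len(w_value))
    ]
    value = sum(w * c for w, c in zip(w_value, context))
    return value, weights


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


# Hypothetical sizes: 8 time steps, 24-dimensional observations, 16 hidden units.
rng = random.Random(0)
T, OBS_DIM, HID = 8, 24, 16
obs_seq = random_matrix(T, OBS_DIM, rng, scale=1.0)
w_in = random_matrix(HID, OBS_DIM, rng)
w_rec = random_matrix(HID, HID, rng)
bias = [0.0] * HID
w_score = [rng.gauss(0.0, 1.0) for _ in range(HID)]
w_value = [rng.gauss(0.0, 1.0) for _ in range(HID)]

states = rnn_forward(obs_seq, w_in, w_rec, bias)
value, weights = attention_value(states, w_score, w_value)
print(f"value estimate: {value:.4f}")
print("attention weights:", [round(a, 3) for a in weights])
```

In this sketch the attention weights sum to one across the time steps, so the value estimate can draw more heavily on whichever past states score highest instead of relying only on the last hidden state.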
Pages: 731-741
Number of pages: 10