Gait Balance and Acceleration of a Biped Robot Based on Q-Learning

Cited by: 37
Authors
Lin, Jin-Ling [1]
Hwang, Kao-Shing [2]
Jiang, Wei-Cheng [2]
Chen, Yu-Jen [3]
Affiliations
[1] Shih Hsin Univ, Dept Informat Management, Taipei 116, Taiwan
[2] Natl Sun Yat Sen Univ, Dept Elect Engn, Kaohsiung 80424, Taiwan
[3] Natl Chung Cheng Univ, Dept Elect Engn, Chiayi 62102, Taiwan
Source
IEEE ACCESS | 2016 / Vol. 4
Keywords
Reinforcement learning; biped robot; continuous action space; zero moment point; ALGORITHM;
DOI
10.1109/ACCESS.2016.2570255
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
This paper presents a method for biped dynamic walking and balance control using reinforcement learning, which learns dynamic walking without a priori knowledge of the dynamic model. The learning architecture aims to solve complex control problems in robotic actuation by mapping the action space from a discretized domain to a continuous one: it employs discrete actions to construct a policy for continuous action. The architecture allows the dimensionality of the state space and the cardinality of the action set to be scaled to represent new knowledge or new requirements for a desired task. The balance learning method uses the motion of the robot's arms and legs to shift the zero moment point on the soles of the robot, and can thereby keep the biped robot in a statically stable state. This balance algorithm is applied to biped walking on a flat surface and on a seesaw, and makes the biped's walking more stable. Simulation shows that the proposed method allows the robot to learn to improve its behavior in terms of walking speed. Finally, the methods are implemented on a physical biped robot to demonstrate the feasibility and effectiveness of the proposed learning scheme.
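The abstract does not say how the discrete actions are combined into a continuous one. As a rough illustration only, the sketch below assumes a softmax-weighted blend of the discrete actions, with weights taken from their Q-values, alongside the standard tabular Q-learning update. The function names, the `temperature` parameter, and the blending rule itself are assumptions for illustration and are not taken from the paper.

```python
import math


def continuous_action(q_values, actions, temperature=1.0):
    """Blend discrete actions into one continuous action.

    Assumption (not from the paper): each discrete action is weighted by
    a softmax over its Q-value, so higher-valued actions dominate.
    """
    best = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - best) / temperature) for q in q_values]
    total = sum(weights)
    return sum(w * a for w, a in zip(weights, actions)) / total


def q_update(q, reward, next_max_q, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update for a single state-action value."""
    return q + alpha * (reward + gamma * next_max_q - q)


if __name__ == "__main__":
    # Hypothetical joint-angle adjustments (discrete actions) and their Q-values.
    actions = [-1.0, 0.0, 1.0]
    q_values = [0.2, 0.5, 0.9]
    print(continuous_action(q_values, actions))
    print(q_update(q=0.0, reward=1.0, next_max_q=0.0))
```

With equal Q-values the blend reduces to the plain average of the discrete actions; as one Q-value dominates, the output approaches that single action, which is one simple way a discrete action set can induce a continuous policy.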
Pages: 2439 - 2449
Page count: 11
Related papers
50 in total
  • [1] Gait Balance of Biped Robot based on Reinforcement Learning
    Hwang, Kao-Shing
    Li, Jhe-Syun
    Jiang, Wei-Cheng
    Wang, Wei-Han
    2013 PROCEEDINGS OF SICE ANNUAL CONFERENCE (SICE), 2013, : 435 - 439
  • [2] Footstep planning for biped robot based on fuzzy Q-learning approach
    Sabourin, Christophe
    Madani, Kurosh
    Yu, Weiwei
    Yan, Jie
    ICINCO 2008: PROCEEDINGS OF THE FIFTH INTERNATIONAL CONFERENCE ON INFORMATICS IN CONTROL, AUTOMATION AND ROBOTICS, VOL RA-1: ROBOTICS AND AUTOMATION, VOL 1, 2008, : 183 - +
  • [3] Balance Control of Robot With CMAC Based Q-learning
    Li Ming-ai
    Jiao Li-fang
    Qiao Jun-fei
    Ruan Xiao-gang
    2008 CHINESE CONTROL AND DECISION CONFERENCE, VOLS 1-11, 2008, : 2668 - 2672
  • [4] A simple trajectory optimization method with Q-learning for biped gait
    Hu, LY
    Sun, ZQ
    DYNAMICS OF CONTINUOUS DISCRETE AND IMPULSIVE SYSTEMS-SERIES B-APPLICATIONS & ALGORITHMS, 2005, 1 : 329 - 332
  • [5] Estimating probability distribution with Q-learning for biped gait generation and optimization
    Hu, Lingyun
    Zhou, Changjiu
    Sun, Zengqi
    2006 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-12, 2006, : 362 - +
  • [6] Estimating biped gait using spline-based probability distribution function with Q-learning
    Hu, Lingyun
    Zhou, Changjiu
    Sun, Zengqi
    IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2008, 55 (03) : 1444 - 1452
  • [7] A Surrogate Model based Gait Learning for Biped Robot
    Luo, Dingsheng
    Wang, Yi
    Wu, Xihong
    ADVANCES IN MECHATRONICS AND CONTROL ENGINEERING II, PTS 1-3, 2013, 433-435 : 138 - 145
  • [8] Q-Learning of Straightforward Gait Pattern for Humanoid Robot Based on Automatic Training Platform
    Wong, Ching-Chang
    Liu, Chih-Cheng
    Xiao, Sheng-Ru
    Yang, Hao-Yu
    Lau, Meng-Cheng
    ELECTRONICS, 2019, 8 (06)
  • [9] Asymptotically stable gait generation for biped robot based on mechanical energy balance
    Asano, Fumihiko
    Luo, Zhi-Wei
    2007 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-9, 2007, : 3333 - 3339
  • [10] Modeling and fuzzy Q-learning control of biped walking
    Meng Joo Er
    Yi Zhou
    Proceedings of the 24th Chinese Control Conference, Vols 1 and 2, 2005, : 641 - 646