A deep reinforcement learning algorithm to control a two-wheeled scooter with a humanoid robot

Cited by: 12

Authors:

Baltes, Jacky [1]

Christmann, Guilherme [1]

Saeedvand, Saeed [1]

Affiliations:

[1] Natl Taiwan Normal Univ, Dept Elect Engn, Taipei, Taiwan

Source:

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE | 2023 / Vol. 126

Keywords:

Deep reinforcement learning; Proximal policy optimization (PPO); Two-wheeled vehicles; PID control; Humanoid robotics;

DOI:

10.1016/j.engappai.2023.106941

Chinese Library Classification:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Balancing a two-wheeled scooter is a challenging task for robots, as it is a non-linear control problem in a highly dynamic environment. The rapid development of deep reinforcement learning has enabled robots to perform complex control tasks. In this paper, a deep reinforcement learning algorithm is proposed to learn the steering control of the scooter for balancing and path tracking using an unmodified humanoid robot. Two control strategies are developed, analyzed, and compared: a classical Proportional-Integral-Derivative (PID) controller and a Deep Reinforcement Learning (DRL) controller based on the Proximal Policy Optimization (PPO) algorithm. The ability of the robot to balance the scooter using both approaches is extensively evaluated. Challenging control scenarios are tested at low scooter speeds of 2.5, 5, and 10 km/h. Steering velocities are also varied, including 10, 20, and 40 rad/s. The evaluations include upright balance without disturbances, upright balance under disturbances, tracking a sinusoidal path, and path tracking. A 3D model of the humanoid robot and scooter system is developed and simulated in a state-of-the-art GPU-based simulation environment (NVIDIA's Isaac Gym), which serves as the training and test bed. Although the PID controller successfully balances the robot, better final results are achieved with the proposed DRL controller. The results indicate a 52% improvement on average across the tested speeds, with better performance in path tracking control. Evaluation of the controller commands on the real robot and scooter indicates that the robot is fully capable of realizing the required steering velocities.
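For readers unfamiliar with the classical baseline named in the abstract, the following is a minimal sketch of a textbook PID loop that maps a roll-angle error to a clamped steering-velocity command. It is not the authors' controller: the class name `SteeringPID`, the gains, the choice of roll error as the input, and the `max_rate` limit are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SteeringPID:
    """Textbook PID loop: roll error in, steering-velocity command out.

    All names and gains are hypothetical; the paper's abstract does not
    specify the controller's inputs, gains, or structure.
    """
    kp: float
    ki: float
    kd: float
    max_rate: float  # assumed steering-velocity limit in rad/s
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, roll_error: float, dt: float) -> float:
        # Accumulate the integral term and estimate the derivative
        # with a backward difference over one time step.
        self.integral += roll_error * dt
        derivative = (roll_error - self.prev_error) / dt
        self.prev_error = roll_error
        command = (self.kp * roll_error
                   + self.ki * self.integral
                   + self.kd * derivative)
        # Clamp to the assumed steering-velocity limit.
        return max(-self.max_rate, min(self.max_rate, command))


# Small error: the command stays inside the limit.
pid = SteeringPID(kp=8.0, ki=0.5, kd=0.2, max_rate=10.0)
small_command = pid.step(roll_error=0.05, dt=0.01)

# Large error: the command saturates at max_rate.
saturated_pid = SteeringPID(kp=8.0, ki=0.5, kd=0.2, max_rate=10.0)
large_command = saturated_pid.step(roll_error=5.0, dt=0.01)
```

The clamp stands in for the steering-velocity limits (10, 20, and 40 rad/s) that the abstract says were varied; how the authors actually enforce those limits is not stated in this record.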

Pages: 20

