Reinforcement Learning DDPG-PPO Agent-Based Control System for Rotary Inverted Pendulum

Cited by: 5

Authors:

Bhourji, Rajmeet Singh ^{[1]}

Mozaffari, Saeed ^{[1]}

Alirezaee, Shahpour ^{[1,2]}

Affiliations:

[1] Univ Windsor, Mech Automot & Mat Engn Dept, Windsor, ON, Canada

[2] Univ Windsor, Fac Engn, Windsor, ON, Canada

Source:

ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING | 2024 / Vol. 49 / Issue 02

Keywords:

Reinforcement learning; Deep deterministic policy gradient; Proximal policy optimization; Rotary inverted pendulum; Simulink;

DOI:

10.1007/s13369-023-07934-2

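Chinese Library Classification (CLC):

O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];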

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract:

The rotary inverted pendulum (RIP) is a nonlinear system used as a benchmark for testing control strategies. The RIP system has many applications in balancing robotic systems such as drones and humanoid robots. Controlling the RIP system is a complex task without solid knowledge of classical control engineering. This paper uses a reinforcement learning (RL) approach to control the RIP instead of classical controllers such as PID (proportional-integral-derivative) and LQR (linear-quadratic regulator). A deep deterministic policy gradient-proximal policy optimization (DDPG-PPO) agent is proposed and implemented to control the rotary inverted pendulum platform in both simulation and hardware. A DDPG agent with 13 layers is trained for the swing-up action of the pendulum, and the mode selection process is trained and tested using a PPO agent. The proposed controller is compared with other RL agents such as soft actor-critic-proximal policy optimization (SAC-PPO). Additionally, the proposed method is tested against a conventional PID controller for different pendulum mass values to validate its effectiveness. Finally, the proposed RL controller is implemented on a real-time RIP apparatus (Quanser Qube-Servo). Results show that the DDPG-PPO RL agent is much more effective than the SAC-PPO agent during swing-up control.

Pages: 1683-1696

Page count: 14

50 items in total

[1] Reinforcement Learning DDPG–PPO Agent-Based Control System for Rotary Inverted Pendulum
Rajmeet Singh Bhourji
Saeed Mozaffari
Shahpour Alirezaee
Arabian Journal for Science and Engineering, 2024, 49 : 1683 - 1696
[2] Modeling, Simulation, and Control of a Rotary Inverted Pendulum: A Reinforcement Learning-Based Control Approach
Hernandez, Ruben
Garcia-Hernandez, Ramon
Jurado, Francisco
MODELLING, 2024, 5 (04) : 1824 - 1852
[3] Imitation Reinforcement Learning-Based Remote Rotary Inverted Pendulum Control in OpenFlow Network
Kim, Ju-Bong
Lim, Hyun-Kyo
Kim, Chan-Myung
Kim, Min-Suk
Hong, Yong-Geun
Han, Youn-Hee
IEEE ACCESS, 2019, 7 : 36682 - 36690
[4] Vague neural network based reinforcement learning control system for inverted pendulum
Zhao, Yibiao
Luo, Siwei
Wang, Liang
Ma, Aidong
Fang, Rui
NEURAL INFORMATION PROCESSING, PT 3, PROCEEDINGS, 2006, 4234 : 692 - 701
[5] Robust Control of An Inverted Pendulum System Based on Policy Iteration in Reinforcement Learning
Ma, Yan
Xu, Dengguo
Huang, Jiashun
Li, Yahui
APPLIED SCIENCES-BASEL, 2023, 13 (24):
[6] Research on Control System of Rotary Inverted Pendulum Based on ARM
Liu, Jia
PROCEEDINGS OF 2014 INTERNATIONAL CONFERENCE ON MECHANICS AND MECHANICAL ENGINEERING, 2014, 684 : 381 - 385
[7] Reinforcement Learning Compensation based PD Control for Inverted Pendulum
Puriel-Gil, Guillermo
Yu, Wen
Sossa, Humberto
2018 15TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING, COMPUTING SCIENCE AND AUTOMATIC CONTROL (CCE), 2018,
[8] Active exploration planning in reinforcement learning for inverted pendulum system control
Zheng, Yu
Luo, Si-Wei
Lv, Zi-Ang
PROCEEDINGS OF 2006 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2006, : 2805 - +
[9] Reinforcement Learning Compensation based PD Control for a Double Inverted Pendulum
Puriel-Gil, G.
Yu, W.
Sossa, H.
IEEE LATIN AMERICA TRANSACTIONS, 2019, 17 (02) : 323 - 329
[10] Generalized Predictive Control for Rotary Inverted Pendulum System
Gao, Qiang
Li, Yi
MECHANICAL AND ELECTRONICS ENGINEERING III, PTS 1-5, 2012, 130-134 : 4256 - +