Optimizing Reinforcement Learning Control Model in Furuta Pendulum and Transferring it to Real-World

Cited by: 1

Authors:

Hong, Myung Rae [1]

Kang, Sanghun [1]

Lee, Jingoo [2]

Seo, Sungchul [3]

Han, Seungyong [1]

Koh, Je-Sung [1]

Kang, Daeshik [1]

Affiliations:

[1] Ajou Univ, Dept Mech Engn, Multiscale Bioinspired Technol Lab, Suwon 16499, South Korea

[2] Korea Inst Machinery & Mat, Dept Sustainable Environm Res, Multiscale Bioinspired Technol Lab, Daejeon 34103, South Korea

[3] Seokyeong Univ, Dept Nanochem Biol & Environm Engn, Seoul 02713, South Korea

Source:

IEEE ACCESS | 2023 / Vol. 11

Funding:

National Research Foundation of Singapore;

Keywords:

Furuta pendulum; inverted pendulum problem; reward design; reinforcement learning; Sim2Real;

DOI:

10.1109/ACCESS.2023.3310405

Chinese Library Classification:

TP [Automation and Computer Technology];

Discipline classification code:

0812;

Abstract:

Reinforcement learning does not require an explicit robot model because it learns from data, but it faces temporal and spatial constraints when transferred to real-world environments. In this research, we trained a controller for the balancing Furuta pendulum problem, which is difficult to model, in a virtual environment (Unity) and transferred it to the real world. The challenge of the balancing Furuta pendulum problem is to keep the pendulum's end effector in the vertical position. We resolved the temporal and spatial constraints by performing reinforcement learning in the virtual environment. Furthermore, we designed a novel reward function that enabled faster and more stable problem-solving than two existing reward functions. We validated each reward function by applying it to soft actor-critic (SAC) and proximal policy optimization (PPO). The experimental results show that training with the cosine reward function is faster and more stable. Finally, the SAC model trained with the cosine reward function in the virtual environment serves as the optimized controller. Additionally, we evaluated the robustness of this model by transferring it to the real environment.
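The abstract names a cosine reward function but does not give its formula. As a rough illustration only, one plausible form rewards the cosine of the pendulum angle measured from the upright position, so the reward is dense and smooth over the whole swing. The function name `cosine_reward`, the angle convention (0 rad = upright), and the absence of any scaling or penalty terms are all assumptions here, not the authors' definition.

```python
import math


def cosine_reward(theta: float) -> float:
    """Hypothetical cosine reward (an assumed form, not taken from the paper).

    theta: pendulum angle in radians, with 0 meaning perfectly upright.
    Returns a value in [-1, 1]: 1 when upright, -1 when hanging straight down.
    """
    return math.cos(theta)


if __name__ == "__main__":
    # The reward decreases smoothly as the pendulum tilts away from upright.
    for deg in (0, 15, 45, 90, 180):
        print(deg, round(cosine_reward(math.radians(deg)), 3))
```

A smooth reward of this kind gives the agent a gradient toward the upright position at every angle, which is one common reason such shaping can speed up training compared with sparse or thresholded rewards.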

Pages: 95195-95200

Number of pages: 6
