Optimizing Reinforcement Learning Control Model in Furuta Pendulum and Transferring it to Real-World

Cited by: 1
Authors
Hong, Myung Rae [1 ]
Kang, Sanghun [1 ]
Lee, Jingoo [2 ]
Seo, Sungchul [3 ]
Han, Seungyong [1 ]
Koh, Je-Sung [1 ]
Kang, Daeshik [1 ]
Affiliations
[1] Ajou Univ, Dept Mech Engn, Multiscale Bioinspired Technol Lab, Suwon 16499, South Korea
[2] Korea Inst Machinery & Mat, Dept Sustainable Environm Res, Multiscale Bioinspired Technol Lab, Daejeon 34103, South Korea
[3] Seokyeong Univ, Dept Nanochem Biol & Environm Engn, Seoul 02713, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Furuta pendulum; inverted pendulum problem; reward design; reinforcement learning; Sim2Real;
DOI
10.1109/ACCESS.2023.3310405
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Reinforcement learning does not require an explicit robot model because it learns directly from data, but it faces temporal and spatial constraints when transferred to real-world environments. In this research, we trained the balancing Furuta pendulum problem, which is difficult to model, in a virtual environment (Unity) and transferred the result to the real world. The challenge of the balancing Furuta pendulum problem is to keep the pendulum's end effector in a vertical position. We resolved the temporal and spatial constraints by performing reinforcement learning in the virtual environment. Furthermore, we designed a novel reward function that enabled faster and more stable learning than two existing reward functions. We validated each reward function by applying it to the soft actor-critic (SAC) and proximal policy optimization (PPO) algorithms. The experimental results show that the cosine reward function trains faster and more stably. Finally, the SAC model trained with the cosine reward function in the virtual environment serves as the optimized controller. Additionally, we evaluated the robustness of this model by transferring it to the real environment.
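The abstract does not give the exact form of the cosine reward, so the following is only a minimal sketch of how a cosine-shaped balancing reward might look, assuming the pendulum angle is measured from the upright position; the function name and signature are illustrative, not the authors' implementation.

```python
import numpy as np

def cosine_reward(pendulum_angle_rad: float) -> float:
    """Hypothetical cosine-shaped reward for the balancing task.

    The angle is measured from the upright position, so the reward
    peaks at 1 when the end effector is vertical and decreases
    smoothly as the pendulum tilts away.
    """
    return float(np.cos(pendulum_angle_rad))

# Example: reward near the upright position vs. a 45-degree tilt.
print(cosine_reward(0.0))        # 1.0 (perfectly balanced)
print(cosine_reward(np.pi / 4))  # ~0.707 (tilted 45 degrees)
```

A smooth, bounded reward of this kind gives the agent a non-zero learning signal at every tilt angle, which is one plausible reason such a shaping could train faster and more stably than sparse or threshold-based alternatives.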
Pages: 95195 - 95200
Page count: 6
Related Papers
50 records total
  • [31] Reinforcement Learning Compensation based PD Control for a Double Inverted Pendulum
    Puriel-Gil, G.
    Yu, W.
    Sossa, H.
    IEEE LATIN AMERICA TRANSACTIONS, 2019, 17 (02) : 323 - 329
  • [32] Approximate neural optimal control with reinforcement learning for a torsional pendulum device
    Wang, Ding
    Qiao, Junfei
    NEURAL NETWORKS, 2019, 117 : 1 - 7
  • [33] Design of Reinforcement Learning Algorithm for Single Inverted Pendulum Swing Control
    Yue Chao
    Liu Yongxin
    Wang Linglin
    2018 CHINESE AUTOMATION CONGRESS (CAC), 2018, : 1558 - 1562
  • [34] Active exploration planning in reinforcement learning for inverted pendulum system control
    Zheng, Yu
    Luo, Si-Wei
    Lv, Zi-Ang
    PROCEEDINGS OF 2006 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2006, : 2805 - +
  • [35] VesNet-RL: Simulation-Based Reinforcement Learning for Real-World US Probe Navigation
    Bi, Yuan
    Jiang, Zhongliang
    Gao, Yuan
    Wendler, Thomas
    Karlas, Angelos
    Navab, Nassir
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2022, 7 (03) : 6638 - 6645
  • [36] A World Model for Actor–Critic in Reinforcement Learning
    A. I. Panov
    L. A. Ugadiarov
    Pattern Recognition and Image Analysis, 2023, 33 : 467 - 477
  • [37] Quantum Reinforcement Learning with Quantum World Model
    Zeng, Peigen
    He, Ying
    Yu, F. Richard
    Leung, Victor C. M.
    IEEE CONFERENCE ON GLOBAL COMMUNICATIONS, GLOBECOM, 2023, : 3185 - 3190
  • [38] Self-optimizing adaptive optics control with Reinforcement Learning
    Landman, R.
    Haffert, S. Y.
    Radhakrishnan, V. M.
    Keller, C. U.
    ADAPTIVE OPTICS SYSTEMS VII, 2020, 11448
  • [39] Robust Control of An Inverted Pendulum System Based on Policy Iteration in Reinforcement Learning
    Ma, Yan
    Xu, Dengguo
    Huang, Jiashun
    Li, Yahui
    APPLIED SCIENCES-BASEL, 2023, 13 (24):
  • [40] Integration of Adaptive Control and Reinforcement Learning for Real-Time Control and Learning
    Annaswamy, Anuradha M.
    Guha, Anubhav
    Cui, Yingnan
    Tang, Sunbochen
    Fisher, Peter A.
    Gaudio, Joseph E.
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2023, 68 (12) : 7740 - 7755