Robust Reference Signal Self-Organizing Control based on Deep Reinforcement Learning

Cited by: 3

Authors:

Iwasaki, Hiromichi [1]

Okuyama, Atsushi [2]

Affiliations:

[1] Tokai Univ, Grad Sch Sci & Technol, 4-1-1 Kitakaname, Hiratsuka, Kanagawa 2591292, Japan

[2] Tokai Univ, Dept Precis Engn, 4-1-1 Kitakaname, Hiratsuka, Kanagawa 2591292, Japan

Source:

IEEJ JOURNAL OF INDUSTRY APPLICATIONS | 2022 / Vol. 11 / No. 6

Keywords:

adaptive control; control theory based on machine learning; reference signal self-organization; deep reinforcement learning; track-following control; inverted pendulum with inertia rotor;

DOI:

10.1541/ieejjia.21005735

Chinese Library Classification (CLC) codes:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes:

0808; 0809;

Abstract:

In general, the effects of modeling errors, parameter variations, and external disturbances of the controlled object degrade tracking control performance. To address these problems, we have developed a control method based on deep reinforcement learning. Accordingly, a reference signal self-organizing control system based on a deep deterministic policy gradient (DDPG) is proposed, which is an extension of an existing control system using DDPG. In a previous study, we confirmed the realization of the swing-up and stabilizing motions of an inverted pendulum using the proposed control system (8). However, the addition of a new ability to the system could not be verified in that study. Thus, in this work, we aim to verify whether a new function can be added to the proposed control system. By performing a control simulation, we verified whether the proposed system can achieve robustness by using the inverted pendulum with an inertia rotor. A control simulation of the system was performed by adding noise into the system, and its control performance was investigated to confirm the robustness of the system. The simulation results indicate that the pendulum could not be inverted using an un-retrained control system. However, it was confirmed to have been inverted and stabilized by the swing-up and stabilizing control using a retrained control system. Moreover, the retrained control system could effectively function under the effect of noise with an accuracy close to that of a noise-free state. Therefore, we confirmed that the addition of robustness can be realized in the proposed control system.
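The abstract names DDPG and a noise-injection test but gives no equations, so the following is only a minimal sketch of the generic DDPG building blocks (critic target and soft target-network update) plus a simple Gaussian measurement-noise helper. The function names, the hyperparameter values, the noise model, and the three-element state layout are all assumptions made for illustration; none of them is taken from the paper.

```python
import random


def td_target(reward, next_q, done, gamma=0.99):
    """Critic regression target: r + gamma * Q'(s', mu'(s')) for non-terminal steps."""
    return reward + (0.0 if done else gamma * next_q)


def soft_update(target, online, tau=0.005):
    """Blend online parameters into target parameters (Polyak averaging)."""
    return [tau * w + (1.0 - tau) * wt for w, wt in zip(online, target)]


def add_measurement_noise(state, sigma, rng):
    """Return a copy of the state with zero-mean Gaussian noise on each component."""
    return [x + rng.gauss(0.0, sigma) for x in state]


rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical state layout: pendulum angle, angular velocity, rotor speed.
state = [3.14, 0.0, 0.0]
noisy = add_measurement_noise(state, sigma=0.01, rng=rng)
```

In a robustness check of the kind the abstract describes, one could feed `noisy` rather than `state` to the trained policy and compare tracking error with and without noise; how the authors actually injected noise is not stated in this record.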


Pages: 737-743

Page count: 7

References (10 in total):

[1] Egami T., 1995, Transactions of the Society of Instrument and Control Engineers, V31, P1618

[2] Egami T., 1996, JOURNAL ROBOTICS SOC, V14, P406

[3] Gilbert E.G., Tan K.T., 1991, Linear systems with state and control constraints: the theory and application of maximal output admissible sets, IEEE Transactions on Automatic Control, V36(9), P1008-1020

[4] Guo H., 2014, T ISCIE, V27, P187

[5] Hirata K., 1999, Transactions of the Institute of Systems, Control and Information Engineers, V12, P586, DOI 10.5687/iscie.12.586

[6] Iwasaki H., 2021, P INT C MECHATRONICS

[7] Kingma D.P., 2014, ADV NEUR IN, V27

[8] Kobayashi M., 1994, J JAPAN SOC MECH ENG, V60, P144

[9] Lillicrap T.P., 2019, arXiv

[10] Okuyama A., 2014, IEEJ T IND APPL, V134, P667
