Trajectory Control of An Articulated Robot Based on Direct Reinforcement Learning

Cited by: 5
Authors
Tsai, Chia-Hao [1]
Lin, Jun-Ji [1]
Hsieh, Teng-Feng [1]
Yen, Jia-Yush [2]
Affiliations
[1] Natl Taiwan Univ, Dept Mech Engn, 1,Sec 4,Roosevelt Rd, Taipei 106216, Taiwan
[2] Natl Taiwan Univ Sci & Technol, Dept Mech Engn, 43,Sec 4, Taipei 106335, Taiwan
Keywords
robotics; trajectory following; reinforcement learning; model predictive control
DOI
10.3390/robotics11050116
Chinese Library Classification number
TP24 [Robotics]
Discipline classification code
080202; 1405
Abstract
Reinforcement learning (RL) is attracting considerable research attention because it lets a system learn by interacting with its environment. Despite its many successful applications, the use of RL for direct joint torque control without the help of an underlying dynamic model has not been reported in the literature. This study presents a split network structure that enables an RL agent to be trained successfully to learn direct torque control for trajectory following with a six-axis articulated robot, without prior knowledge of the robot's dynamic model. Training took a very long time to converge. Nevertheless, we demonstrate successful control along four different trajectories without an accurate dynamics model or complex inverse kinematics computation. To show the effectiveness of the RL-based control, we also compare it with model predictive control (MPC), another popular trajectory control method. Our results show that although MPC achieves smoother and more accurate control, it does not automatically handle singularities and requires complex inverse dynamics calculations. The RL controller, in contrast, instinctively avoided violent actions around the singularities.
Pages: 20