A Self-Adaptive Double Q-Backstepping Trajectory Tracking Control Approach Based on Reinforcement Learning for Mobile Robots

Cited by: 5
Authors
He, Naifeng [1 ]
Yang, Zhong [1 ]
Fan, Xiaoliang [2 ]
Wu, Jiying [1 ]
Sui, Yaoyu [1 ]
Zhang, Qiuyan [3 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Automat Engn, Nanjing 211106, Peoples R China
[2] Chinese Acad Sci, State Key Lab Robot, Shenyang Inst Automat, Shenyang 110017, Peoples R China
[3] Guizhou Power Grid Co Ltd, Elect Power Res Inst, Guiyang 550002, Peoples R China
Keywords
reinforcement learning; double Q-backstepping control; mobile robot; trajectory tracking control;
DOI
10.3390/act12080326
Chinese Library Classification (CLC)
TH [Machinery and Instrument Industry];
Subject Classification Code
0802;
Abstract
When a mobile robot performs indoor inspection tasks with complex requirements, the traditional backstepping method cannot guarantee trajectory accuracy, leading to problems such as the instrument falling outside the image frame and focus failure when the robot captures images at high zoom. To solve this problem, this paper proposes an adaptive backstepping method based on double Q-learning for mobile robot trajectory tracking control. We design an incremental, model-free double Q-learning algorithm that can quickly learn to rectify the trajectory tracking controller gains online. For the gain-rectification problem under non-uniform state-space exploration, we propose an incremental active-learning exploration algorithm that incorporates memory replay and experience replay mechanisms, enabling the agent to learn and rectify controller gains quickly online. To verify the feasibility of the algorithm, we validate it on different types of trajectories in Gazebo and on a physical platform. The results show that the adaptive trajectory tracking control algorithm can rectify the gains of the mobile robot's trajectory tracking controller. Compared with the Backstepping-Fractional-Order PID controller and the Fuzzy-Backstepping controller, double Q-backstepping offers better robustness, generalization, real-time performance, and stronger anti-disturbance capability.
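As a rough illustration of the update loop the abstract describes, the sketch below shows tabular double Q-learning with epsilon-greedy exploration and an experience replay buffer, applied to incremental adjustment of a controller gain. The state discretization, the action set of gain increments, and all hyperparameter values are assumptions made for illustration; they are not the authors' implementation.

```python
# Minimal sketch of a double Q-learning gain-adaptation loop (assumed
# structure, not the paper's code). States are discretized tracking
# errors; actions are incremental adjustments to a controller gain.
import random
from collections import deque

import numpy as np

N_STATES = 100                  # number of discretized error states (assumed)
ACTIONS = [-0.1, 0.0, 0.1]      # incremental gain adjustments (assumed)
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

q_a = np.zeros((N_STATES, len(ACTIONS)))
q_b = np.zeros((N_STATES, len(ACTIONS)))
replay = deque(maxlen=10_000)   # experience replay buffer


def select_action(state: int) -> int:
    """Epsilon-greedy selection over the sum of both Q-tables."""
    if random.random() < EPS:
        return random.randrange(len(ACTIONS))
    return int(np.argmax(q_a[state] + q_b[state]))


def double_q_update(s: int, a: int, r: float, s_next: int) -> None:
    """Double Q-learning: one table selects the greedy action,
    the other evaluates it, reducing maximization bias."""
    if random.random() < 0.5:
        a_star = int(np.argmax(q_a[s_next]))
        q_a[s, a] += ALPHA * (r + GAMMA * q_b[s_next, a_star] - q_a[s, a])
    else:
        b_star = int(np.argmax(q_b[s_next]))
        q_b[s, a] += ALPHA * (r + GAMMA * q_a[s_next, b_star] - q_b[s, a])


def replay_step(batch_size: int = 32) -> None:
    """Re-apply stored transitions to accelerate online learning."""
    if len(replay) >= batch_size:
        for s, a, r, s_next in random.sample(replay, batch_size):
            double_q_update(s, a, r, s_next)
```

In an online control loop, each step would store the transition (s, a, r, s_next) in `replay`, call `double_q_update` on it, and periodically call `replay_step` so past experience speeds up the gain correction, consistent with the replay mechanisms the abstract mentions.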
Pages: 24