A Self-Adaptive Double Q-Backstepping Trajectory Tracking Control Approach Based on Reinforcement Learning for Mobile Robots

Cited by: 5
Authors
He, Naifeng [1 ]
Yang, Zhong [1 ]
Fan, Xiaoliang [2 ]
Wu, Jiying [1 ]
Sui, Yaoyu [1 ]
Zhang, Qiuyan [3 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Automat Engn, Nanjing 211106, Peoples R China
[2] Chinese Acad Sci, State Key Lab Robot, Shenyang Inst Automat, Shenyang 110017, Peoples R China
[3] Guizhou Power Grid Co Ltd, Elect Power Res Inst, Guiyang 550002, Peoples R China
Keywords
reinforcement learning; double Q-backstepping control; mobile robot; trajectory tracking control;
DOI
10.3390/act12080326
Chinese Library Classification (CLC)
TH [Machinery and Instrumentation Industry];
Subject Classification Code
0802;
Abstract
When a mobile robot performs indoor inspection tasks with complex requirements, the traditional backstepping method cannot guarantee trajectory accuracy, leading to problems such as the instrument falling outside the image frame and focus failure when the robot captures images at high zoom. To solve this problem, this paper proposes an adaptive backstepping method based on double Q-learning for trajectory tracking control of mobile robots. We design an incremental, model-free double Q-learning algorithm that quickly learns to rectify the trajectory tracking controller gain online. For the controller gain rectification problem under non-uniform state-space exploration, we propose an incremental active-learning exploration algorithm that incorporates memory replay and experience replay mechanisms, enabling the agent to learn quickly online and rectify the controller gain. To verify the feasibility of the algorithm, we validate it on different types of trajectories in Gazebo and on a physical platform. The results show that the adaptive trajectory tracking control algorithm can rectify the gain of the mobile robot's trajectory tracking controller. Compared with the backstepping fractional-order PID controller and the fuzzy backstepping controller, double Q-backstepping achieves better robustness, generalization, and real-time performance, as well as stronger anti-disturbance capability.
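The core update the abstract builds on is standard tabular double Q-learning (van Hasselt, 2010, cited as reference [12]): two value tables are maintained, and on each step one table selects the greedy next action while the other evaluates it, which removes the maximization bias of single-table Q-learning. The sketch below shows only this generic update, not the paper's incremental gain-rectification algorithm; the function name, table shapes, and hyperparameters are illustrative assumptions.

```python
import numpy as np

def double_q_update(qa, qb, s, a, r, s_next, alpha=0.1, gamma=0.95, rng=None):
    """One tabular double Q-learning step (van Hasselt, 2010).

    qa, qb : (n_states, n_actions) arrays, the two value tables.
    With probability 0.5 we update qa, selecting the greedy action
    with qa but evaluating it with qb (and vice versa), so neither
    table's own noise inflates its update target.
    """
    rng = rng or np.random.default_rng()
    if rng.random() < 0.5:
        a_star = int(np.argmax(qa[s_next]))                # select with qa
        target = r + gamma * qb[s_next, a_star]            # evaluate with qb
        qa[s, a] += alpha * (target - qa[s, a])
    else:
        b_star = int(np.argmax(qb[s_next]))                # select with qb
        target = r + gamma * qa[s_next, b_star]            # evaluate with qa
        qb[s, a] += alpha * (target - qb[s, a])
    return qa, qb
```

In the paper's setting, the "action" would correspond to a discrete adjustment of a backstepping controller gain and the reward to a tracking-error criterion; those mappings are specific to the paper and not reproduced here.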
Pages: 24
Related Papers
51 records in total
  • [11] Haarnoja T, 2018, IEEE INT CONF ROBOT, P6244
  • [12] Hasselt H., 2010, P ADV NEURAL INFORM, V23
  • [13] Ibrahim M.M.S., 2021, J. Phys. Conf. Ser, V2128, P012018, DOI 10.1088/1742-6596/2128/1/012018
  • [14] Jamshidi Faezeh, 2021, 2021 7th International Conference on Web Research (ICWR), P82, DOI 10.1109/ICWR51868.2021.9443139
  • [15] Multi-objective approach for robot motion planning in search tasks
    Jeddisaravi, Kossar
    Alitappeh, Reza Javanmard
    Pimenta, Luciano C. A.
    Guimaraes, Frederico G.
    [J]. APPLIED INTELLIGENCE, 2016, 45 (02) : 305 - 321
  • [16] Kanayama Y., 1991, Proceedings IROS '91. IEEE/RSJ International Workshop on Intelligent Robots and Systems '91. Intelligence for Mechanical Systems (Cat. No.91TH0375-6), P1236, DOI 10.1109/IROS.1991.174669
  • [17] A Double Q-Learning Approach for Navigation of Aerial Vehicles with Connectivity Constraint
    Khamidehi, Behzad
    Sousa, Elvino S.
    [J]. ICC 2020 - 2020 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), 2020,
  • [18] Khan Semab Neimat, 2021, Proceedings of 2021 International Conference on Artificial Intelligence (ICAI), P264, DOI 10.1109/ICAI52203.2021.9445200
  • [19] Reinforcement learning and optimal adaptive control: An overview and implementation examples
    Khan, Said G.
    Herrmann, Guido
    Lewis, Frank L.
    Pipe, Tony
    Melhuish, Chris
    [J]. ANNUAL REVIEWS IN CONTROL, 2012, 36 (01) : 42 - 59
  • [20] Kou Bo, 2021, Intelligent Equipment, Robots, and Vehicles: 7th International Conference on Life System Modeling and Simulation, LSMS 2021 and 7th International Conference on Intelligent Computing for Sustainable Energy and Environment, ICSEE 2021. Communications in Computer and Information Science (1469), P704, DOI 10.1007/978-981-16-7213-2_68