A Self-Adaptive Double Q-Backstepping Trajectory Tracking Control Approach Based on Reinforcement Learning for Mobile Robots

Cited by: 5
Authors
He, Naifeng [1 ]
Yang, Zhong [1 ]
Fan, Xiaoliang [2 ]
Wu, Jiying [1 ]
Sui, Yaoyu [1 ]
Zhang, Qiuyan [3 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Automat Engn, Nanjing 211106, Peoples R China
[2] Chinese Acad Sci, State Key Lab Robot, Shenyang Inst Automat, Shenyang 110017, Peoples R China
[3] Guizhou Power Grid Co Ltd, Elect Power Res Inst, Guiyang 550002, Peoples R China
Keywords
reinforcement learning; double Q-backstepping control; mobile robot; trajectory tracking control;
DOI
10.3390/act12080326
Chinese Library Classification (CLC)
TH [Machinery and Instrument Industry];
Subject Classification Code
0802;
Abstract
When a mobile robot performs indoor inspection tasks with complex requirements, the traditional backstepping method cannot guarantee trajectory accuracy, leading to problems such as the instrument falling outside the image frame and focus failure when the robot captures images at high zoom. To solve this problem, this paper proposes an adaptive backstepping method based on double Q-learning for mobile robot trajectory tracking control. We design an incremental, model-free double Q-learning algorithm that can quickly learn to rectify the trajectory tracking controller gains online. For the gain-rectification problem under non-uniform state-space exploration, we propose an incremental active-learning exploration algorithm that incorporates memory replay and experience replay mechanisms, enabling the agent to learn and rectify controller gains quickly online. To verify the feasibility of the algorithm, we validate it on different types of trajectories in Gazebo and on a physical platform. The results show that the adaptive trajectory tracking control algorithm can rectify the gains of the mobile robot's trajectory tracking controller. Compared with the Backstepping-Fractional-Order PID controller and the Fuzzy-Backstepping controller, double Q-backstepping offers better robustness, generalization, real-time performance, and stronger anti-disturbance capability.
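As a rough illustration of the update loop the abstract describes, the sketch below shows tabular double Q-learning with epsilon-greedy exploration and an experience replay buffer, applied to incremental adjustment of a controller gain. The state discretization, the action set of gain increments, and all hyperparameter values are assumptions made for illustration; they are not the authors' implementation.

```python
# Minimal sketch of a double Q-learning gain-adaptation loop (assumed
# structure, not the paper's code). States are discretized tracking
# errors; actions are incremental adjustments to a controller gain.
import random
from collections import deque

import numpy as np

N_STATES = 100                  # number of discretized error states (assumed)
ACTIONS = [-0.1, 0.0, 0.1]      # incremental gain adjustments (assumed)
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

q_a = np.zeros((N_STATES, len(ACTIONS)))
q_b = np.zeros((N_STATES, len(ACTIONS)))
replay = deque(maxlen=10_000)   # experience replay buffer


def select_action(state: int) -> int:
    """Epsilon-greedy selection over the sum of both Q-tables."""
    if random.random() < EPS:
        return random.randrange(len(ACTIONS))
    return int(np.argmax(q_a[state] + q_b[state]))


def double_q_update(s: int, a: int, r: float, s_next: int) -> None:
    """Double Q-learning: one table selects the greedy action,
    the other evaluates it, reducing maximization bias."""
    if random.random() < 0.5:
        a_star = int(np.argmax(q_a[s_next]))
        q_a[s, a] += ALPHA * (r + GAMMA * q_b[s_next, a_star] - q_a[s, a])
    else:
        b_star = int(np.argmax(q_b[s_next]))
        q_b[s, a] += ALPHA * (r + GAMMA * q_a[s_next, b_star] - q_b[s, a])


def replay_step(batch_size: int = 32) -> None:
    """Re-apply stored transitions to accelerate online learning."""
    if len(replay) >= batch_size:
        for s, a, r, s_next in random.sample(replay, batch_size):
            double_q_update(s, a, r, s_next)
```

In an online control loop, each step would store the transition (s, a, r, s_next) in `replay`, call `double_q_update` on it, and periodically call `replay_step` so past experience speeds up the gain correction, consistent with the replay mechanisms the abstract mentions.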
Pages: 24