A path following controller for deep-sea mining vehicles considering slip control and random resistance based on improved deep deterministic policy gradient

Cited by: 10
Authors
Chen, Qihang [1 ,2 ,3 ]
Yang, Jianmin [1 ,2 ,3 ]
Mao, Jinghang [1 ,2 ]
Liang, Zhixuan [4 ]
Lu, Changyu [1 ,2 ,3 ]
Sun, Pengfei [1 ,2 ,3 ]
Affiliations
[1] Shanghai Jiao Tong Univ, State Key Lab Ocean Engn, Shanghai 200240, Peoples R China
[2] Shanghai Jiao Tong Univ, Yazhou Bay Inst Deepsea SCI TECH, Sanya 572024, Hainan, Peoples R China
[3] Shanghai Jiao Tong Univ, Inst Marine Equipment, Shanghai 200240, Peoples R China
[4] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China
Keywords
Deep-sea mining vehicle; Path following; Improved deep deterministic policy gradient; Slip control; Deep reinforcement learning; GAME; GO;
DOI
10.1016/j.oceaneng.2023.114069
CLC classification
U6 [Water transport]; P75 [Ocean engineering];
Discipline codes
0814; 081505; 0824; 082401;
Abstract
This study aimed to develop a deep-sea mining vehicle (DSMV) path-following controller that better reflects actual deep-sea mining conditions. First, the dynamic model of the DSMV was improved: by introducing a nonlinear slip-control model and random environmental resistance, the controlled plant was brought closer to actual mining operating conditions. Second, an improved deep deterministic policy gradient (IDDPG) algorithm was proposed. Compared with the standard DDPG algorithm, the improved algorithm reduces overestimation of the Q value and enhances the agent's ability to explore for the global optimum. A warm-up stage was introduced to improve stability at the beginning of training and to accelerate convergence. Third, a general reward function was designed for this class of problem. Combined with the uncertainty of the improved model, this improved the controller's generalization ability and adaptability to unknown environments. Finally, the path-following control ability of the controller was verified through a random one-point-following training test in the simulation environment and through comparison tests on different paths.
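The abstract names two algorithmic ideas without giving details: curbing Q-value overestimation (in the spirit of the clipped double-Q target of the cited Fujimoto et al., 2018) and a warm-up stage before policy-driven exploration. As a minimal, hypothetical sketch of both ideas in plain Python (all function names, defaults, and action bounds are illustrative assumptions, not the authors' implementation):

```python
import random

def clipped_double_q_target(r, q1_next, q2_next, gamma=0.99, done=False):
    # Take the minimum of two target-critic estimates (as in TD3) so the
    # bootstrapped TD target is less prone to overestimation bias.
    q_min = min(q1_next, q2_next)
    return r + (0.0 if done else gamma) * q_min

def select_action(policy_action, step, warmup_steps=1000, low=-1.0, high=1.0):
    # During the warm-up stage, sample actions uniformly at random so the
    # replay buffer starts with diverse transitions; afterwards, defer to
    # the (noise-perturbed) policy output.
    if step < warmup_steps:
        return random.uniform(low, high)
    return policy_action
```

This is only a schematic of the two mechanisms; the paper's controller additionally couples them with the slip model, the random-resistance plant, and a purpose-built reward function.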
Pages: 19
References (29 records)
  • [1] Ahmad M., 2000, Proceedings 2000 ICRA. Millennium Conference. IEEE International Conference on Robotics and Automation. Symposia Proceedings (Cat. No.00CH37065), P2938, DOI 10.1109/ROBOT.2000.846474
  • [2] Chen Q., 2022, 32 INT OCEAN POLAR
  • [3] Nonlinear Multi-Body Dynamic Modeling and Coordinated Motion Control Simulation of Deep-Sea Mining System
    Dai, Yu
    Yin, Wanwu
    Ma, Feiyue
    [J]. IEEE ACCESS, 2019, 7 : 86242 - 86251
  • [4] Direct and indirect adaptive integral line-of-sight path-following controllers for marine craft exposed to ocean currents
    Fossen, Thor I.
    Lekkas, Anastasios M.
    [J]. INTERNATIONAL JOURNAL OF ADAPTIVE CONTROL AND SIGNAL PROCESSING, 2017, 31 (04) : 445 - 463
  • [5] Fujimoto S, 2018, PR MACH LEARN RES, V80
  • [6] Heess NTBD., 2017, ARXIV
  • [7] Reinforcement learning: A survey
    Kaelbling, LP
    Littman, ML
    Moore, AW
    [J]. JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 1996, 4 : 237 - 285
  • [8] Path Planning for UAV Ground Target Tracking via Deep Reinforcement Learning
    Li, Bohao
    Wu, Yunjie
    [J]. IEEE ACCESS, 2020, 8 : 29064 - 29074
  • [9] Three-Dimensional Path Following of an Underactuated AUV Based on Fuzzy Backstepping Sliding Mode Control
    Liang, Xiao
    Qu, Xingru
    Wan, Lei
    Ma, Qiang
    [J]. INTERNATIONAL JOURNAL OF FUZZY SYSTEMS, 2018, 20 (02) : 640 - 649
  • [10] Hierarchical Reinforcement Learning with Opponent Modeling for Distributed Multi-agent Cooperation
    Liang, Zhixuan
    Cao, Jiannong
    Jiang, Shan
    Saxena, Divya
    Xu, Huafeng
    [J]. 2022 IEEE 42ND INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS 2022), 2022, : 884 - 894