A path following controller for deep-sea mining vehicles considering slip control and random resistance based on improved deep deterministic policy gradient

Cited by: 10
Authors
Chen, Qihang [1 ,2 ,3 ]
Yang, Jianmin [1 ,2 ,3 ]
Mao, Jinghang [1 ,2 ]
Liang, Zhixuan [4 ]
Lu, Changyu [1 ,2 ,3 ]
Sun, Pengfei [1 ,2 ,3 ]
Affiliations
[1] Shanghai Jiao Tong Univ, State Key Lab Ocean Engn, Shanghai 200240, Peoples R China
[2] Shanghai Jiao Tong Univ, Yazhou Bay Inst Deepsea SCI TECH, Sanya 572024, Hainan, Peoples R China
[3] Shanghai Jiao Tong Univ, Inst Marine Equipment, Shanghai 200240, Peoples R China
[4] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China
Keywords
Deep-sea mining vehicle; Path following; Improved deep deterministic policy gradient; Slip control; Deep reinforcement learning; GAME; GO;
DOI
10.1016/j.oceaneng.2023.114069
CLC classification
U6 [Water transport]; P75 [Ocean engineering];
Discipline codes
0814; 081505; 0824; 082401;
Abstract
This study aimed to develop a deep-sea mining vehicle (DSMV) path-following controller that better reflects actual deep-sea mining conditions. First, the dynamic model of the DSMV was improved: by introducing a nonlinear slip-control model and random environmental resistance, the controlled plant was brought closer to actual mining operating conditions. Second, an improved deep deterministic policy gradient (IDDPG) algorithm was proposed. Compared with the standard DDPG algorithm, the improved algorithm reduces overestimation of the Q value and enhances the agent's ability to explore for the global optimum. A warm-up stage was introduced to improve stability at the beginning of training and to accelerate convergence. Third, a general reward function was designed for this class of problem. Combined with the uncertainty of the improved model, this improved the controller's generalization ability and adaptability to unknown environments. Finally, the path-following control ability of the controller was verified through a random one-point-following training test in the simulation environment and through comparison tests on different paths.
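The abstract names two algorithmic ideas without giving details: curbing Q-value overestimation (in the spirit of the clipped double-Q target of the cited Fujimoto et al., 2018) and a warm-up stage before policy-driven exploration. As a minimal, hypothetical sketch of both ideas in plain Python (all function names, defaults, and action bounds are illustrative assumptions, not the authors' implementation):

```python
import random

def clipped_double_q_target(r, q1_next, q2_next, gamma=0.99, done=False):
    # Take the minimum of two target-critic estimates (as in TD3) so the
    # bootstrapped TD target is less prone to overestimation bias.
    q_min = min(q1_next, q2_next)
    return r + (0.0 if done else gamma) * q_min

def select_action(policy_action, step, warmup_steps=1000, low=-1.0, high=1.0):
    # During the warm-up stage, sample actions uniformly at random so the
    # replay buffer starts with diverse transitions; afterwards, defer to
    # the (noise-perturbed) policy output.
    if step < warmup_steps:
        return random.uniform(low, high)
    return policy_action
```

This is only a schematic of the two mechanisms; the paper's controller additionally couples them with the slip model, the random-resistance plant, and a purpose-built reward function.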
Pages: 19
References (29 records)
  • [1] Ahmad M., 2000, Proceedings 2000 ICRA. Millennium Conference. IEEE International Conference on Robotics and Automation. Symposia Proceedings (Cat. No.00CH37065), P2938, DOI 10.1109/ROBOT.2000.846474
  • [2] Chen Q., 2022, 32 INT OCEAN POLAR
  • [3] Nonlinear Multi-Body Dynamic Modeling and Coordinated Motion Control Simulation of Deep-Sea Mining System
    Dai, Yu
    Yin, Wanwu
    Ma, Feiyue
    [J]. IEEE ACCESS, 2019, 7 : 86242 - 86251
  • [4] Direct and indirect adaptive integral line-of-sight path-following controllers for marine craft exposed to ocean currents
    Fossen, Thor I.
    Lekkas, Anastasios M.
    [J]. INTERNATIONAL JOURNAL OF ADAPTIVE CONTROL AND SIGNAL PROCESSING, 2017, 31 (04) : 445 - 463
  • [5] Fujimoto S, 2018, PR MACH LEARN RES, V80
  • [6] Heess NTBD., 2017, ARXIV
  • [7] Reinforcement learning: A survey
    Kaelbling, LP
    Littman, ML
    Moore, AW
    [J]. JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 1996, 4 : 237 - 285
  • [8] Path Planning for UAV Ground Target Tracking via Deep Reinforcement Learning
    Li, Bohao
    Wu, Yunjie
    [J]. IEEE ACCESS, 2020, 8 : 29064 - 29074
  • [9] Three-Dimensional Path Following of an Underactuated AUV Based on Fuzzy Backstepping Sliding Mode Control
    Liang, Xiao
    Qu, Xingru
    Wan, Lei
    Ma, Qiang
    [J]. INTERNATIONAL JOURNAL OF FUZZY SYSTEMS, 2018, 20 (02) : 640 - 649
  • [10] Hierarchical Reinforcement Learning with Opponent Modeling for Distributed Multi-agent Cooperation
    Liang, Zhixuan
    Cao, Jiannong
    Jiang, Shan
    Saxena, Divya
    Xu, Huafeng
    [J]. 2022 IEEE 42ND INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS 2022), 2022, : 884 - 894