Unmanned surface vehicle (USV) systems are strongly coupled and nonlinear, and they are further subject to environmental disturbances from wind and currents, which makes it challenging to achieve accurate trajectory tracking by directly controlling the underlying actuator parameters, such as rudder angle and rotational speed. Therefore, this paper proposes a proximal policy optimisation (PPO) control scheme based on an intrinsic curiosity module (ICM). First, according to the training characteristics of deep reinforcement learning (DRL) algorithms, an improved guidance law is proposed, which effectively solves the problem of the desired speed exceeding the maximum allowable speed when tracking errors are large during the random exploration of the USV in the early stage of training. Unlike traditional DRL methods, the proposed method incorporates intrinsic rewards, generated by the intrinsic curiosity module, alongside the extrinsic rewards from the training environment; these intrinsic rewards incentivise the agent to actively explore unknown states and acquire new knowledge, which improves training outcomes and prevents premature convergence of the model. Finally, multiple tracking scenarios containing both simple and complex trajectories are designed and constructed, and the simulation results show that the ICM-PPO method performs well on the accurate trajectory-tracking problem.
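The combination of extrinsic and curiosity-driven intrinsic rewards described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the fixed linear forward model, the feature and action dimensions, and the scaling factor `eta` are illustrative stand-ins (in an actual ICM the forward model is a learned network over encoded state features).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 4-D state features, 2-D action (rudder, speed).
FEAT_DIM, ACT_DIM = 4, 2

# Stand-in forward model: a fixed random linear map predicting the next
# state features from the current features and the action taken.
W = rng.normal(size=(FEAT_DIM + ACT_DIM, FEAT_DIM))

def forward_model(phi_s, a):
    """Predict phi(s') from phi(s) and action a."""
    return np.concatenate([phi_s, a]) @ W

def intrinsic_reward(phi_s, a, phi_s_next, eta=0.1):
    """ICM-style curiosity bonus: scaled forward-model prediction error.

    The bonus is large in states the model predicts poorly (novel states)
    and shrinks as those states become familiar.
    """
    err = forward_model(phi_s, a) - phi_s_next
    return 0.5 * eta * float(err @ err)

def total_reward(r_ext, phi_s, a, phi_s_next, eta=0.1):
    """Reward seen by the PPO agent: extrinsic plus intrinsic."""
    return r_ext + intrinsic_reward(phi_s, a, phi_s_next, eta)

# One illustrative transition with a negative extrinsic tracking reward.
phi_s = rng.normal(size=FEAT_DIM)
a = rng.normal(size=ACT_DIM)
phi_s_next = rng.normal(size=FEAT_DIM)
r = total_reward(-0.3, phi_s, a, phi_s_next)
```

Because the prediction-error bonus is non-negative, the combined reward never falls below the extrinsic reward, and the bonus vanishes once the forward model predicts a transition exactly.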