COLREGs-Based Path Planning for USVs Using the Deep Reinforcement Learning Strategy

Cited by: 5

Authors:

Wen, Naifeng [1]

Long, Yundong [1]

Zhang, Rubo [1]

Liu, Guanqun [1]

Wan, Wenjie [1]

Jiao, Dian [1]

Affiliations:

[1] Dalian Minzu Univ, Coll Mech & Elect Engn, Dalian 116600, Peoples R China

Source:

JOURNAL OF MARINE SCIENCE AND ENGINEERING | 2023 / Vol. 11 / Issue 12

Keywords:

COLREGs; USV cooperative path planning; multi-agent proximal policy optimization; deep learning; target detection; COLLISION-AVOIDANCE;

DOI:

10.3390/jmse11122334

Chinese Library Classification (CLC) codes:

U6 [Waterway Transportation]; P75 [Ocean Engineering];

Discipline classification codes:

0814 ; 081505 ; 0824 ; 082401 ;

Abstract:

This research introduces a two-stage deep reinforcement learning approach for the cooperative path planning of unmanned surface vehicles (USVs). The method is designed to address cooperative collision-avoidance path planning while adhering to the International Regulations for Preventing Collisions at Sea (COLREGs) and considering the collision-avoidance problem within the USV fleet and between USVs and target ships (TSs). To achieve this, the study presents a dual COLREGs-compliant action-selection strategy to effectively manage the vessel-avoidance problem. Firstly, we construct a COLREGs-compliant action-evaluation network that utilizes a deep learning network trained on pre-recorded TS avoidance trajectories by USVs in compliance with COLREGs. Then, the COLREGs-compliant reward-function-based action-selection network is proposed by considering various TS encountering scenarios. Consequently, the results of the two networks are fused to select actions for cooperative path-planning processes. The path-planning model is established using the multi-agent proximal policy optimization (MAPPO) method. The action space, observation space, and reward function are tailored for the policy network. Additionally, a TS detection method is introduced to detect the motion intentions of TSs. The study conducted Monte Carlo simulations to demonstrate the strong performance of the planning method. Furthermore, experiments focusing on COLREGs-based TS avoidance were carried out to validate the feasibility of the approach. The proposed TS detection model exhibited robust performance within the defined task.

Pages: 21

50 items in total

[1] A COLREGs-based path-planning method for collision avoidance considering path cost through reinforcement learning
Song, Wanping
Chen, Zengqiang
Sun, Mingwei
Wang, Yongshuai
Sun, Qinglin
OCEAN ENGINEERING, 2025, 325
[2] COLREGS-based Path Planning for Ships at Sea Using Velocity Obstacles
Zhang, Wenjun
Yan, Chengyong
Lyu, Hongguang
Wang, Pinglin
Xue, Zongyao
Li, Zehua
Xiao, Bai
IEEE ACCESS, 2021, 9 : 32613 - 32626
[3] Path planning method for USVs based on improved DWA and COLREGs
Liu, Shiqi
Wang, Xingmin
Wu, Yang
Li, Qian
Yan, Jiuxiang
Levin, Eugene
INTELLIGENCE & ROBOTICS, 2024, 4 (04) : 385 - 405
[4] A path planning strategy unified with a COLREGS collision avoidance function based on deep reinforcement learning and artificial potential field
Li, Lingyu
Wu, Defeng
Huang, Youqiang
Yuan, Zhi-Ming
APPLIED OCEAN RESEARCH, 2021, 113
[5] COLREGs-abiding hybrid collision avoidance algorithm based on deep reinforcement learning for USVs
Xu, Xinli
Lu, Yu
Liu, Gang
Cai, Peng
Zhang, Weidong
OCEAN ENGINEERING, 2022, 247
[6] Intelligent collision avoidance algorithms for USVs via deep reinforcement learning under COLREGs
Xu, Xinli
Lu, Yu
Liu, Xiaocheng
Zhang, Weidong
OCEAN ENGINEERING, 2020, 217
[7] Path planning and dynamic collision avoidance algorithm under COLREGs via deep reinforcement learning
Xu, Xinli
Cai, Peng
Ahmed, Zahoor
Yellapu, Vidya Sagar
Zhang, Weidong
NEUROCOMPUTING, 2022, 468 : 181 - 197
[8] A Novel Reinforcement Learning Collision Avoidance Algorithm for USVs Based on Maneuvering Characteristics and COLREGs
Fan, Yunsheng
Sun, Zhe
Wang, Guofeng
SENSORS, 2022, 22 (06)
[9] Robot path planning based on deep reinforcement learning
Long, Yinxin
He, Huajin
2020 IEEE CONFERENCE ON TELECOMMUNICATIONS, OPTICS AND COMPUTER SCIENCE (TOCS), 2020 : 151 - 154
[10] Robot Path Planning Based on Deep Reinforcement Learning
Zhang, Rui
Jiang, Yuhao
Wu, Fenghua
2022 34TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2022 : 1697 - 1701