COLREGs-Based Path Planning for USVs Using the Deep Reinforcement Learning Strategy

Cited by: 5

Authors:

Wen, Naifeng [1]

Long, Yundong [1]

Zhang, Rubo [1]

Liu, Guanqun [1]

Wan, Wenjie [1]

Jiao, Dian [1]

Affiliation:

[1] Dalian Minzu Univ, Coll Mech & Elect Engn, Dalian 116600, Peoples R China

Source:

JOURNAL OF MARINE SCIENCE AND ENGINEERING | 2023 / Volume 11 / Issue 12

Keywords:

COLREGs; USV cooperative path planning; multi-agent proximal policy optimization; deep learning; target detection; COLLISION-AVOIDANCE;

DOI:

10.3390/jmse11122334

Chinese Library Classification (CLC) number:

U6 [Waterway Transportation]; P75 [Ocean Engineering];

Discipline classification code:

0814 ; 081505 ; 0824 ; 082401 ;

Abstract:

This research introduces a two-stage deep reinforcement learning approach for the cooperative path planning of unmanned surface vehicles (USVs). The method is designed to address cooperative collision-avoidance path planning while adhering to the International Regulations for Preventing Collisions at Sea (COLREGs) and considering the collision-avoidance problem within the USV fleet and between USVs and target ships (TSs). To achieve this, the study presents a dual COLREGs-compliant action-selection strategy to effectively manage the vessel-avoidance problem. Firstly, we construct a COLREGs-compliant action-evaluation network that utilizes a deep learning network trained on pre-recorded TS avoidance trajectories by USVs in compliance with COLREGs. Then, the COLREGs-compliant reward-function-based action-selection network is proposed by considering various TS encountering scenarios. Consequently, the results of the two networks are fused to select actions for cooperative path-planning processes. The path-planning model is established using the multi-agent proximal policy optimization (MAPPO) method. The action space, observation space, and reward function are tailored for the policy network. Additionally, a TS detection method is introduced to detect the motion intentions of TSs. The study conducted Monte Carlo simulations to demonstrate the strong performance of the planning method. Furthermore, experiments focusing on COLREGs-based TS avoidance were carried out to validate the feasibility of the approach. The proposed TS detection model exhibited robust performance within the defined task.
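The abstract says the outputs of the two networks are fused to select actions, but it does not give the fusion rule. The following is only a minimal sketch of one plausible scheme, a weighted blend of the policy's action probabilities with softmax-normalized compliance scores. The function names, the `weight` parameter, and the blending rule are all assumptions for illustration, not the authors' method.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_action_scores(policy_probs, compliance_scores, weight=0.5):
    """Blend policy probabilities with compliance scores.

    Hypothetical rule: convex combination of the policy distribution and
    the softmax of the compliance scores. `weight` (assumed, in [0, 1])
    controls how strongly the compliance scores influence the result.
    """
    if len(policy_probs) != len(compliance_scores):
        raise ValueError("both inputs must cover the same action set")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    compliance_probs = softmax(compliance_scores)
    fused = [
        (1.0 - weight) * p + weight * c
        for p, c in zip(policy_probs, compliance_probs)
    ]
    total = sum(fused)
    return [f / total for f in fused]


def select_action(policy_probs, compliance_scores, weight=0.5):
    """Return the index of the action with the highest fused probability."""
    fused = fuse_action_scores(policy_probs, compliance_scores, weight)
    return max(range(len(fused)), key=fused.__getitem__)


if __name__ == "__main__":
    # Three hypothetical actions, e.g. turn to port, turn to starboard, keep course.
    policy = [0.5, 0.3, 0.2]
    compliance = [-2.0, 2.0, 0.0]
    print(select_action(policy, compliance, weight=0.5))
```

With `weight=0.0` the sketch reduces to greedy selection from the policy alone; larger values let the compliance scores override the policy's preference.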


Pages: 21

50 items in total

[21] Mobile Service Robot Path Planning Using Deep Reinforcement Learning
Kumaar, A. A. Nippun
Kochuvila, Sreeja
IEEE ACCESS, 2023, 11 : 100083 - 100096
[22] Real Time Path Planning of Robot using Deep Reinforcement Learning
Raajan, Jeevan
Srihari, P., V
Satya, Jayadev P.
Bhikkaji, B.
Pasumarthy, Ramkrishna
IFAC PAPERSONLINE, 2020, 53 (02) : 15602 - 15607
[23] Multi-objective path planning based on deep reinforcement learning
Xu, Jian
Huang, Fei
Cui, Yunfei
Du, Xue
2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022 : 3273 - 3279
[24] Improved Robot Path Planning Method Based on Deep Reinforcement Learning
Han, Huiyan
Wang, Jiaqi
Kuang, Liqun
Han, Xie
Xue, Hongxin
SENSORS, 2023, 23 (12)
[25] Dynamic Scene Path Planning of UAVs Based on Deep Reinforcement Learning
Tang, Jin
Liang, Yangang
Li, Kebo
DRONES, 2024, 8 (02)
[26] AUV path planning based on improved IFDS and deep reinforcement learning
Fan, Yiqun
Li, Hongna
Xie, Jiaqi
Zhou, Yunfu
INTERNATIONAL JOURNAL OF ADVANCED ROBOTIC SYSTEMS, 2024, 21 (06)
[27] Path planning of robotic arm based on deep reinforcement learning algorithm
Al-Gabalawy M.
Advanced Control for Applications: Engineering and Industrial Systems, 2022, 4 (01)
[28] Ship path planning based on Deep Reinforcement Learning and weather forecast
Artusi, Eva
2021 22ND IEEE INTERNATIONAL CONFERENCE ON MOBILE DATA MANAGEMENT (MDM 2021), 2021 : 258 - 260
[29] Robot Patrol Path Planning Based on Combined Deep Reinforcement Learning
Li, Wenqi
Chen, Dehua
Le, Jiajin
2018 IEEE INT CONF ON PARALLEL & DISTRIBUTED PROCESSING WITH APPLICATIONS, UBIQUITOUS COMPUTING & COMMUNICATIONS, BIG DATA & CLOUD COMPUTING, SOCIAL COMPUTING & NETWORKING, SUSTAINABLE COMPUTING & COMMUNICATIONS, 2018 : 659 - 666
[30] A path planning method based on deep reinforcement learning for crowd evacuation
Meng X.
Liu H.
Li W.
Journal of Ambient Intelligence and Humanized Computing, 2024, 15 (6) : 2925 - 2939
