COLREGs-Based Path Planning for USVs Using the Deep Reinforcement Learning Strategy

Cited by: 5
Authors
Wen, Naifeng [1]
Long, Yundong [1]
Zhang, Rubo [1]
Liu, Guanqun [1]
Wan, Wenjie [1]
Jiao, Dian [1]
Affiliations
[1] Dalian Minzu Univ, Coll Mech & Elect Engn, Dalian 116600, Peoples R China
Keywords
COLREGs; USV cooperative path planning; multi-agent proximal policy optimization; deep learning; target detection; COLLISION-AVOIDANCE;
DOI
10.3390/jmse11122334
Chinese Library Classification (CLC) number
U6 [Waterway Transportation]; P75 [Ocean Engineering];
Discipline classification code
0814 ; 081505 ; 0824 ; 082401 ;
Abstract
This research introduces a two-stage deep reinforcement learning approach for the cooperative path planning of unmanned surface vehicles (USVs). The method addresses cooperative collision-avoidance path planning in compliance with the International Regulations for Preventing Collisions at Sea (COLREGs), covering collision avoidance both within the USV fleet and between USVs and target ships (TSs). To this end, the study presents a dual COLREGs-compliant action-selection strategy for vessel avoidance. First, a COLREGs-compliant action-evaluation network is constructed: a deep learning network trained on pre-recorded, COLREGs-compliant trajectories of USVs avoiding TSs. Second, an action-selection network based on a COLREGs-compliant reward function is proposed, which accounts for various TS encounter scenarios. The outputs of the two networks are then fused to select actions during cooperative path planning. The path-planning model is built with the multi-agent proximal policy optimization (MAPPO) method, and the action space, observation space, and reward function are tailored for the policy network. Additionally, a TS detection method is introduced to detect the motion intentions of TSs. Monte Carlo simulations were conducted to demonstrate the performance of the planning method, and experiments on COLREGs-based TS avoidance were carried out to validate the feasibility of the approach. The proposed TS detection model exhibited robust performance within the defined task.
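The abstract states that the outputs of the two networks are fused to select actions but does not give the fusion rule. As a rough illustration only, the sketch below assumes a discrete action set and a weighted average of softmax-normalized scores. The function names, the `weight` parameter, and the averaging rule are all hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Convert raw scores into a probability distribution (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_action_scores(eval_scores, policy_scores, weight=0.5):
    """Blend two per-action score vectors.

    ASSUMPTION: a convex combination of the two softmax distributions.
    The paper's actual fusion rule is not described in the abstract.
    eval_scores   -- hypothetical output of the action-evaluation network
    policy_scores -- hypothetical output of the reward-based selection network
    weight        -- share given to the evaluation network, in [0, 1]
    """
    if len(eval_scores) != len(policy_scores):
        raise ValueError("both networks must score the same action set")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    p_eval = softmax(eval_scores)
    p_policy = softmax(policy_scores)
    return [weight * a + (1.0 - weight) * b for a, b in zip(p_eval, p_policy)]


def select_action(eval_scores, policy_scores, weight=0.5):
    """Return the index of the action with the highest fused probability."""
    fused = fuse_action_scores(eval_scores, policy_scores, weight)
    return max(range(len(fused)), key=fused.__getitem__)
```

With `weight=1.0` the choice follows the evaluation network alone, and with `weight=0.0` it follows the reward-based network alone; intermediate values trade off the two sources.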
Pages: 21
Related papers
50 in total
  • [21] Mobile Service Robot Path Planning Using Deep Reinforcement Learning
    Kumaar, A. A. Nippun
    Kochuvila, Sreeja
    IEEE ACCESS, 2023, 11 : 100083 - 100096
  • [22] Real Time Path Planning of Robot using Deep Reinforcement Learning
    Raajan, Jeevan
    Srihari, P., V
    Satya, Jayadev P.
    Bhikkaji, B.
    Pasumarthy, Ramkrishna
    IFAC PAPERSONLINE, 2020, 53 (02): 15602 - 15607
  • [23] Multi-objective path planning based on deep reinforcement learning
    Xu, Jian
    Huang, Fei
    Cui, Yunfei
    Du, Xue
    2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022: 3273 - 3279
  • [24] Improved Robot Path Planning Method Based on Deep Reinforcement Learning
    Han, Huiyan
    Wang, Jiaqi
    Kuang, Liqun
    Han, Xie
    Xue, Hongxin
    SENSORS, 2023, 23 (12)
  • [25] Dynamic Scene Path Planning of UAVs Based on Deep Reinforcement Learning
    Tang, Jin
    Liang, Yangang
    Li, Kebo
    DRONES, 2024, 8 (02)
  • [26] AUV path planning based on improved IFDS and deep reinforcement learning
    Fan, Yiqun
    Li, Hongna
    Xie, Jiaqi
    Zhou, Yunfu
    INTERNATIONAL JOURNAL OF ADVANCED ROBOTIC SYSTEMS, 2024, 21 (06)
  • [27] Path planning of robotic arm based on deep reinforcement learning algorithm
    Al-Gabalawy M.
    Advanced Control for Applications: Engineering and Industrial Systems, 2022, 4 (01)
  • [28] Ship path planning based on Deep Reinforcement Learning and weather forecast
    Artusi, Eva
    2021 22ND IEEE INTERNATIONAL CONFERENCE ON MOBILE DATA MANAGEMENT (MDM 2021), 2021: 258 - 260
  • [29] Robot Patrol Path Planning Based on Combined Deep Reinforcement Learning
    Li, Wenqi
    Chen, Dehua
    Le, Jiajin
    2018 IEEE INT CONF ON PARALLEL & DISTRIBUTED PROCESSING WITH APPLICATIONS, UBIQUITOUS COMPUTING & COMMUNICATIONS, BIG DATA & CLOUD COMPUTING, SOCIAL COMPUTING & NETWORKING, SUSTAINABLE COMPUTING & COMMUNICATIONS, 2018: 659 - 666
  • [30] A path planning method based on deep reinforcement learning for crowd evacuation
    Meng X.
    Liu H.
    Li W.
    Journal of Ambient Intelligence and Humanized Computing, 2024, 15 (6) : 2925 - 2939