Deep reinforcement learning-based collision avoidance for an autonomous ship

Cited by: 125

Authors:

Chun, Do-Hyun [1]

Roh, Myung-Il [1,2]

Lee, Hye-Won [3]

Ha, Jisang [1]

Yu, Donghun [1]

Affiliations:

[1] Seoul Natl Univ, Dept Naval Architecture & Ocean Engn, 1 Gwanak Ro, Seoul 08826, South Korea

[2] Seoul Natl Univ, Res Inst Marine Syst Engn, 1 Gwanak Ro, Seoul 08826, South Korea

[3] Seoul Natl Univ, Res Inst Marine Syst Engn, Seoul, South Korea

Source:

OCEAN ENGINEERING | 2021 / Vol. 234

Keywords:

Collision avoidance; Autonomous ship; Collision risk; COLREGs; Deep reinforcement learning; SIMULATION; BEHAVIOR;

DOI:

10.1016/j.oceaneng.2021.109216

Chinese Library Classification (CLC) number:

U6 [Waterway Transportation]; P75 [Ocean Engineering];

Discipline classification code:

0814 ; 081505 ; 0824 ; 082401 ;

Abstract:

Social interest in autonomous navigation systems for ships is increasing. For a robust autonomous navigation system, the location, speed, and direction of the own ship and of other ships must be identified in real time, and collision avoidance should be performed at an appropriate time by considering the collision risk. In this study, we propose a collision avoidance method that quantitatively assesses the collision risk and then generates an avoidance path. First, a collision risk assessment method based on the ship domain and the closest point of approach (CPA) is proposed. The ship domain is created with an asymmetric shape that accounts for manoeuvring performance and the COLREGs. The CPA is used to compute a quantitative collision risk value. Subsequently, a path generation algorithm based on deep reinforcement learning (DRL) is proposed to determine the avoidance time and to generate an avoidance path that complies with the COLREGs for the most dangerous ship in terms of collision risk. Information on the own ship and the target ship, such as location, speed, heading, and collision risk, is used as the input state, and the rudder angle of the own ship is set as the output action of the DRL. The cost function related to path following and collision avoidance is defined as the reward of the DRL-based collision avoidance method. Additionally, DRL modes are defined so that a flexible avoidance path can be navigated by changing the ratio between path following and collision avoidance. To verify the proposed method, we compared it with the A* algorithm, a traditional path planning algorithm, and analyzed the results for various scenarios. Compared to the A* algorithm, the proposed method reliably avoided collisions through flexible paths under complex and unexpected changes in the situation.
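The abstract names the closest point of approach (CPA) as one input to the risk assessment. As a rough illustration of that quantity only, the sketch below computes the standard distance at CPA (DCPA) and time to CPA (TCPA) for two ships moving at constant velocity. It is not the paper's risk formula, which also involves an asymmetric ship domain that the abstract does not specify; the function name `cpa` and the flat x/y coordinate convention are assumptions made for this example.

```python
import math

def cpa(own_pos, own_vel, tgt_pos, tgt_vel):
    """Return (dcpa, tcpa) for two ships on straight, constant-velocity tracks.

    Positions are (x, y) in metres and velocities are (vx, vy) in m/s.
    A negative tcpa means the closest approach is already in the past.
    Hypothetical helper for illustration; not taken from the paper.
    """
    # Position and velocity of the target relative to the own ship.
    rx, ry = tgt_pos[0] - own_pos[0], tgt_pos[1] - own_pos[1]
    vx, vy = tgt_vel[0] - own_vel[0], tgt_vel[1] - own_vel[1]

    rel_speed_sq = vx * vx + vy * vy
    if rel_speed_sq < 1e-12:
        # No relative motion: the separation never changes.
        return math.hypot(rx, ry), 0.0

    # Time at which the relative distance |r + v t| is minimal.
    tcpa = -(rx * vx + ry * vy) / rel_speed_sq
    # Separation at that time.
    dcpa = math.hypot(rx + vx * tcpa, ry + vy * tcpa)
    return dcpa, tcpa

# Own ship heads north at 10 m/s; target is 500 m east and 1000 m ahead,
# heading south at 10 m/s (a near head-on encounter).
dcpa, tcpa = cpa((0.0, 0.0), (0.0, 10.0), (500.0, 1000.0), (0.0, -10.0))
```

In this encounter the ships close at 20 m/s along the north-south axis, so the closest approach occurs after 50 s at a lateral separation of 500 m. A risk measure would then map small DCPA and small positive TCPA to high risk; how the paper does so is not stated in the abstract.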

Pages: 20
