Sim-to-Real: Mapless Navigation for USVs Using Deep Reinforcement Learning

Cited by: 11
Authors
Wang, Ning [1]
Wang, Yabiao [2, 3, 4]
Zhao, Yuming [2, 3, 4]
Wang, Yong [1]
Li, Zhigang [2, 3, 4]
Affiliations
[1] Univ Sci & Technol China, Sch Informat Sci & Technol, Hefei 230026, Peoples R China
[2] Chinese Acad Sci, Shenyang Inst Automat, Shenyang 110016, Peoples R China
[3] Chinese Acad Sci, Inst Robot, Shenyang 110169, Peoples R China
[4] Chinese Acad Sci, Inst Intelligent Mfg, Shenyang 110169, Peoples R China
Keywords
deep reinforcement learning; mapless navigation; unmanned surface vehicle
DOI
10.3390/jmse10070895
Chinese Library Classification
U6 [Waterway Transportation]; P75 [Ocean Engineering]
Subject classification codes
0814; 081505; 0824; 082401
Abstract
In recent years, mapless navigation using deep reinforcement learning algorithms has shown significant advantages in improving robot motion planning capabilities. However, most past work has focused on aerial and ground robotics, with very little attention paid to unmanned surface vehicle (USV) navigation and ultimate deployment on real platforms. In response, this paper proposes a mapless navigation method based on deep reinforcement learning for USVs. Specifically, we carefully design the observation space, action space, reward function, and neural network for a navigation policy that allows the USV to reach the destination collision-free when equipped with only local sensors. To address the sim-to-real transfer problem and the slow convergence of deep reinforcement learning, this paper proposes a dynamics-free training and consistency strategy and designs domain randomization and adaptive curriculum learning. The method was evaluated in a range of tests in simulated and physical environments and was shown to work effectively in a real navigation environment.
Pages: 23