MSN: Mapless Short-Range Navigation Based on Time Critical Deep Reinforcement Learning

Cited by: 11
Authors
Li, Bohan [1 ,2 ]
Huang, Zhelong [1 ]
Chen, Tony Weitong [3 ]
Dai, Tianlun [1 ]
Zang, Yalei [1 ]
Xie, Wenbin [4 ]
Tian, Bo [5 ]
Cai, Ken [6 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing 211106, Peoples R China
[2] Minist Ind & Informat Technol, Key Lab Safety Crit Software, Collaborat Innovat Ctr Novel Software Technol & I, Suzhou 211106, Jiangsu, Peoples R China
[3] Univ Adelaide, Fac Sci Engn & Technol, Sch Comp Sci, Adelaide, SA 5005, Australia
[4] Army Engn Univ PLA, Coll Command & Control, Nanjing 211101, Jiangsu, Peoples R China
[5] Tianbot Robot Co Ltd, Nanjing 210043, Peoples R China
[6] Zhongkai Univ Agr & Engn, Coll Automat, Guangzhou 510225, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Navigation; Robots; Reinforcement learning; Service robots; Collision avoidance; Production facilities; Transportation; mapless navigation; DDPG; path planning; robot motion planning; MOBILE ROBOTS;
DOI
10.1109/TITS.2022.3192480
Chinese Library Classification (CLC)
TU [Building Science];
Discipline code
0813;
Abstract
Automated vehicles (AVs) based on reinforcement learning are an important part of intelligent transportation systems. However, the performance of AVs currently relies heavily on map quality, and mapless navigation is a promising approach for navigating unfamiliar, dynamically changing environments. Although many efforts have been made on mapless navigation, existing methods either require prior knowledge, rely on specially constructed environments, or use only simple feature-fusion mechanisms in their networks. In this paper, we propose a deep reinforcement learning method, TC-DDPG, which consists of DDPG, multi-challenge deep learning networks, and a time-critical reward function. Compared with existing approaches, TC-DDPG takes the cost of time into consideration, achieves better performance, and converges more easily. A new open-source simulator is introduced, and extensive experiments demonstrate the performance of TC-DDPG, which outperforms the compared methods with 62.9% lower time cost, 12.0% lower distance cost, and about 90% fewer model parameters.
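The paper's actual time-critical reward function is not reproduced in this record. As a rough illustration of the idea named in the abstract, one common way to make a navigation reward time-critical is to subtract a fixed per-step penalty from a distance-progress shaping term, so longer trajectories accumulate extra cost. The sketch below is hypothetical: the terminal bonuses (`10.0`, `-10.0`) and the `step_time_penalty` constant are assumptions for illustration, not values from the paper.

```python
def time_critical_reward(reached_goal, collided, dist_prev, dist_now,
                         step_time_penalty=0.05):
    """Hypothetical time-critical navigation reward (illustrative only)."""
    # Terminal outcomes: large bonus for reaching the goal,
    # large penalty for a collision (magnitudes are assumed).
    if reached_goal:
        return 10.0
    if collided:
        return -10.0
    # Dense shaping term: progress toward the goal this step,
    # minus a fixed time cost charged on every non-terminal step.
    return (dist_prev - dist_now) - step_time_penalty
```

Under this shaping, an agent that dawdles pays the per-step penalty repeatedly, which is one way a reward can trade distance cost against time cost as the abstract describes.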
Pages: 8628-8637
Page count: 10