Representation Learning and Reinforcement Learning for Dynamic Complex Motion Planning System

Cited by: 6
Authors
Zhou, Chengmin [1 ,2 ]
Huang, Bingding [2 ]
Franti, Pasi [1 ]
Affiliations
[1] Univ Eastern Finland, Sch Comp, Machine Learning Grp, Joensuu 80100, Finland
[2] Shenzhen Technol Univ, Coll Big Data & Internet, Shenzhen 518118, Peoples R China
Keywords
Intelligent robot; motion planning; reinforcement learning (RL); representation learning
DOI
10.1109/TNNLS.2023.3247160
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Indoor motion planning challenges researchers because of the high density and unpredictability of moving obstacles. Classical algorithms work well with static obstacles but suffer from collisions when obstacles are dense and dynamic. Recent reinforcement learning (RL) algorithms provide safe solutions for multiagent robotic motion planning systems, but they face convergence challenges: slow convergence speed and suboptimal converged results. Inspired by RL and representation learning, we introduced ALN-DSAC: a hybrid motion planning algorithm in which attention-based long short-term memory (LSTM) and a novel data replay scheme are combined with discrete soft actor-critic (SAC). First, we implemented a discrete SAC algorithm, i.e., SAC in the setting of a discrete action space. Second, we optimized the existing distance-based LSTM encoding with attention-based encoding to improve the data quality. Third, we introduced a novel data replay method that combines online and offline learning to improve the efficacy of data replay. The convergence of ALN-DSAC outperforms that of trainable state-of-the-art algorithms. Evaluations demonstrate that our algorithm achieves a nearly 100% success rate and reaches the goal in less time in motion planning tasks compared to state-of-the-art methods. The test code is available at https://github.com/CHUENGMINCHOU/ALN-DSAC.
Pages: 11049 - 11063
Page count: 15
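
As an illustration of the discrete SAC component described in the abstract, below is a minimal sketch of the discrete-SAC actor update, assuming a PyTorch implementation; the network sizes, the `DiscretePolicy` class, and all variable names are hypothetical and not taken from the paper. The point of the discrete-action setting is that the policy outputs a categorical distribution, so the entropy-regularized expectation over actions can be computed exactly rather than estimated by sampling.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical dimensions for illustration; the paper's architecture may differ.
STATE_DIM, N_ACTIONS, HIDDEN = 32, 5, 128

class DiscretePolicy(nn.Module):
    """Maps a state to a categorical distribution over discrete actions."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, HIDDEN), nn.ReLU(),
            nn.Linear(HIDDEN, N_ACTIONS),
        )

    def forward(self, state):
        return F.softmax(self.net(state), dim=-1)

def discrete_sac_actor_loss(policy, q1, q2, states, alpha=0.2):
    """Discrete-SAC actor objective: with a finite action set, the
    expectation E_{a~pi}[alpha * log pi(a|s) - Q(s,a)] is summed exactly
    over all actions instead of being Monte Carlo sampled.
    q1 and q2 are critics that map a batch of states to per-action Q-values."""
    probs = policy(states)                     # (batch, N_ACTIONS)
    log_probs = torch.log(probs + 1e-8)        # small epsilon for stability
    q_min = torch.min(q1(states), q2(states))  # clipped double-Q estimate
    return (probs * (alpha * log_probs - q_min)).sum(dim=-1).mean()
```

The paper's other two contributions are orthogonal to this sketch: the attention-based LSTM encoding would shape how `states` are built from observations of surrounding agents, and the hybrid online/offline replay would determine how the training batches are drawn.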