An Advisor-Based Architecture for a Sample-Efficient Training of Autonomous Navigation Agents with Reinforcement Learning

Times Cited: 0
Authors
Wijesinghe, Rukshan Darshana [1 ,2 ]
Tissera, Dumindu [1 ,2 ]
Vithanage, Mihira Kasun [2 ,3 ]
Xavier, Alex [2 ,4 ]
Fernando, Subha [2 ,3 ]
Samarawickrama, Jayathu [1 ,2 ]
Affiliations
[1] Univ Moratuwa, Fac Engn, Dept Elect & Telecommun Engn, Moratuwa 10400, Sri Lanka
[2] Univ Moratuwa, CODEGEN QBITS LAB, Moratuwa 10400, Sri Lanka
[3] Univ Moratuwa, Fac Informat Technol, Dept Computat Math, Moratuwa 10400, Sri Lanka
[4] Univ Moratuwa, Fac Engn, Dept Comp Sci & Engn, Moratuwa 10400, Sri Lanka
Keywords
advisor-based architecture; autonomous agents; reinforcement learning; robotics
DOI
10.3390/robotics12050133
Chinese Library Classification (CLC)
TP24 [Robotics]
Subject Classification Codes
080202; 1405
Abstract
Recent advances in artificial intelligence have enabled reinforcement learning (RL) agents to exceed human-level performance in various gaming tasks. However, despite the state-of-the-art performance of model-free RL algorithms, their high sample complexity means they are rarely applied in robotics, autonomous navigation, and self-driving, where collecting large numbers of samples on real hardware is impractical. Developing sample-efficient learning algorithms is therefore crucial for deploying RL agents in real-world tasks without sacrificing performance. This paper presents an advisor-based learning algorithm that incorporates prior knowledge into training by modifying the deep deterministic policy gradient algorithm to reduce sample complexity. We also propose an effective method of employing an advisor during data collection to train autonomous navigation agents to maneuver physical platforms while minimizing the risk of collision. We analyze the performance of our methods in both simulation and physical experimental setups. The experiments reveal that incorporating an advisor into the training phase significantly reduces sample complexity, without compromising the agent's performance, compared with various benchmark approaches. They also show that constant advisor involvement in the data collection process diminishes the agent's performance, whereas limited involvement makes training more effective.
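The abstract describes, at a high level, an advisor that participates in data collection during DDPG-style training, with limited rather than constant involvement proving most effective. The exact mechanism is not specified here, so the following Python sketch is only an assumed illustration: the advisor's action replaces the learner's action with a probability that decays over episodes, and every transition is stored for off-policy updates. All names (AdvisorGuidedCollector, ToyNavEnv, advisor_prob) and the linear decay schedule are hypothetical and not taken from the paper.

import numpy as np

class ToyNavEnv:
    # Minimal 1-D "navigation" environment used only to make the sketch runnable:
    # the agent starts at position 0 and must reach position 10.
    def reset(self):
        self.pos = 0.0
        return self.pos

    def step(self, action):
        self.pos += float(np.clip(action, -1.0, 1.0))
        done = self.pos >= 10.0
        reward = 1.0 if done else -0.01
        return self.pos, reward, done

class AdvisorGuidedCollector:
    # Collects off-policy transitions for a DDPG-style learner. With probability
    # p(episode) the advisor's action replaces the learner's action; p decays
    # linearly, so the advisor is heavily involved early and rarely involved later.
    def __init__(self, policy, advisor, p_start=0.9, p_end=0.05, decay_episodes=200):
        self.policy = policy
        self.advisor = advisor
        self.p_start, self.p_end = p_start, p_end
        self.decay_episodes = decay_episodes

    def advisor_prob(self, episode):
        frac = min(episode / self.decay_episodes, 1.0)
        return self.p_start + frac * (self.p_end - self.p_start)

    def collect_episode(self, env, episode, max_steps=500):
        transitions, state = [], env.reset()
        p = self.advisor_prob(episode)
        for _ in range(max_steps):
            if np.random.rand() < p:
                action = self.advisor(state)   # prior knowledge steers exploration
            else:
                action = self.policy(state)    # learner acts on its own
            next_state, reward, done = env.step(action)
            transitions.append((state, action, reward, next_state, done))
            state = next_state
            if done:
                break
        return transitions

# Usage: a random learner policy and a hand-coded "drive toward the goal" advisor.
collector = AdvisorGuidedCollector(
    policy=lambda s: np.random.uniform(-1.0, 1.0),
    advisor=lambda s: 1.0,
)
episode_data = collector.collect_episode(ToyNavEnv(), episode=0)
print(f"collected {len(episode_data)} transitions")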
Pages: 27