Reinforcement imitation learning for reliable and efficient autonomous navigation in complex environments

Cited: 0
Author
Kumar D. [1]
Affiliation
[1] Computer Science and Engineering, United College of Engineering and Research, Naini, Prayagraj, Uttar Pradesh, India
Keywords
Autonomous navigation; Deep neural networks; Dynamic environments; Imitation learning; Q-learning; Reinforcement learning
DOI
10.1007/s00521-024-09678-y
Abstract
Reinforcement learning (RL) and imitation learning (IL) are two machine learning techniques that have shown promise for enhancing navigation performance. Both methods seek a policy: RL by optimizing it against a reward signal, IL by imitating expert behavior. In this paper, we propose a novel algorithm, Reinforcement Imitation Learning (RIL), that combines RL and IL to achieve more reliable and efficient navigation in dynamic environments. RIL is a hybrid approach that uses RL for policy optimization and IL to provide guidance from expert demonstrations. We compare the convergence of RIL with that of conventional RL and IL in a dynamic environment with moving obstacles to substantiate the algorithm's performance. The test results indicate that RIL achieves better collision avoidance and navigation efficiency than the traditional methods. The proposed RIL algorithm has broad application prospects in areas such as autonomous driving, unmanned aerial vehicles, and robotics. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024.
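To make the hybrid scheme concrete, the sketch below shows one way such a combination can be wired up in Python: a tabular Q-learning update interleaved with a large-margin imitation update on expert demonstrations. This is a minimal illustration, not the paper's implementation; the grid-world size, the hyperparameters (alpha, gamma, beta, margin), and the margin-based imitation rule are all assumptions made for the example.

import numpy as np

# Illustrative hybrid RL + IL update (not the paper's exact formulation).
# A tabular Q-function is updated by standard Q-learning, plus an
# imitation term that nudges Q-values toward the expert's demonstrated
# actions. All sizes and hyperparameters below are assumed.

n_states, n_actions = 25, 4          # e.g. a 5x5 grid world, 4 moves
alpha, gamma, beta = 0.1, 0.95, 0.5  # learning rate, discount, IL weight
margin = 1.0                         # margin for the imitation term

Q = np.zeros((n_states, n_actions))

def rl_update(s, a, r, s_next):
    """Standard Q-learning temporal-difference update."""
    td_target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (td_target - Q[s, a])

def il_update(s, a_expert):
    """Large-margin imitation update: raise the expert action's value
    whenever any other action scores within `margin` of it."""
    for a in range(n_actions):
        if a == a_expert:
            continue
        violation = Q[s, a] + margin - Q[s, a_expert]
        if violation > 0:
            Q[s, a_expert] += beta * alpha * violation

# Usage: interleave environment transitions with expert demonstrations.
rl_update(s=0, a=1, r=-0.1, s_next=5)  # transition from the environment
il_update(s=0, a_expert=1)             # matching expert demonstration

Here beta weights how strongly the expert's demonstrations shape the Q-function; annealing it toward zero over training is a common way to let the RL objective take over once the demonstrations have bootstrapped a usable policy.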
Pages: 11945–11961 (16 pages)