Reinforcement imitation learning for reliable and efficient autonomous navigation in complex environments

Cited by: 0
Authors
Kumar D. [1]
Affiliation
[1] Computer Science and Engineering, United College of Engineering and Research, Uttar Pradesh, Naini, Prayagraj
Keywords
Autonomous navigation; Deep neural networks; Dynamic environments; Imitation learning; Q-learning; Reinforcement learning
DOI
10.1007/s00521-024-09678-y
Abstract
Reinforcement learning (RL) and imitation learning (IL) are two useful machine learning techniques that have shown potential for enhancing navigation performance. Both methods seek a decision policy, either through reinforcement or through imitation. In this paper, we propose a novel algorithm named Reinforcement Imitation Learning (RIL) that combines RL and IL to achieve more reliable and efficient navigation in dynamic environments. RIL is a hybrid approach that uses RL for policy optimization and IL to learn from expert demonstrations, which provide guidance. To support our algorithm's performance, we compare the convergence of RIL with that of conventional RL and IL in a dynamic environment with moving obstacles. The test results indicate that the RIL algorithm achieves better collision avoidance and navigation efficiency than the traditional methods. The proposed RIL algorithm has broad application prospects in areas such as autonomous driving, unmanned aerial vehicles, and robots. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024.
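The abstract does not spell out how RIL couples its two components, so the following is only an illustrative sketch of the general idea (an RL update plus guidance from expert demonstrations), not the paper's algorithm. It assumes a tabular Q-learning update, as suggested by the keywords, and adds a large-margin imitation term that keeps the expert's action ahead of the alternatives. The corridor environment, the always-move-right expert, and every function name and parameter below are hypothetical.

```python
import random

N_STATES, GOAL = 5, 4          # toy corridor: cells 0..4, goal at the right end
LEFT, RIGHT = 0, 1

def step(s, a):
    """Hypothetical toy dynamics: move one cell, reward 1 on reaching GOAL."""
    s_next = max(0, s - 1) if a == LEFT else min(GOAL, s + 1)
    done = s_next == GOAL
    return s_next, (1.0 if done else 0.0), done

def expert(s):
    """Stand-in for expert demonstrations: always head toward the goal."""
    return RIGHT

def greedy(q, s):
    """Index of the highest-valued action in state s (ties go to the lowest index)."""
    return max(range(len(q[s])), key=q[s].__getitem__)

def rl_update(q, s, a, r, s_next, done, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning (temporal-difference) update."""
    target = r if done else r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])

def il_update(q, s, a_expert, margin=0.1):
    """Imitation term (assumed): lift the expert's action until it leads by `margin`."""
    best_other = max(v for i, v in enumerate(q[s]) if i != a_expert)
    q[s][a_expert] = max(q[s][a_expert], best_other + margin)

def train(episodes=200, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for ep in range(episodes):
        s = ep % GOAL                      # cycle start cells 0..3 so each is visited
        for _ in range(50):                # cap the episode length
            explore = rng.random() < epsilon
            a = rng.choice((LEFT, RIGHT)) if explore else greedy(q, s)
            s_next, r, done = step(s, a)
            rl_update(q, s, a, r, s_next, done)   # RL: policy optimization
            il_update(q, s, expert(s))            # IL: guidance from the expert
            s = s_next
            if done:
                break
    return q

q_table = train()
policy = [greedy(q_table, s) for s in range(GOAL)]
print(policy)
```

In this sketch the imitation term only acts when the expert's action is not already ahead by the margin, so the temporal-difference update remains responsible for the value estimates. How the paper actually balances the two signals may differ.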
Pages: 11945-11961
Page count: 16