Mapless navigation for UAVs via reinforcement learning from demonstrations

Cited by: 3
Authors
Yang, JiaNan [1]
Lu, ShengAo [1]
Han, MingHao [1]
Li, YunPeng [1]
Ma, YuTing [1]
Lin, ZeFeng [1]
Li, HaoWei [1]
Affiliation
[1] Harbin Institute of Technology, School of Astronautics, Harbin 150001, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
autonomous navigation; reinforcement learning; imitation learning; path planning
DOI
10.1007/s11431-022-2292-3
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
This paper addresses mapless navigation for unmanned aerial vehicles (UAVs) in scenarios with limited sensor accuracy and computing capability. A novel learning-based algorithm called soft actor-critic from demonstrations (SACfD) is proposed, which integrates reinforcement learning with imitation learning. Specifically, the maximum entropy reinforcement learning framework is introduced to enhance the exploration capability of the algorithm. Building on this framework, the paper explores how to make full use of demonstration data to accelerate convergence while reliably improving policy performance. The proposed algorithm enables an implementation of mapless navigation for UAVs, and experimental results show that it outperforms existing algorithms.
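The abstract names two ingredients, maximum entropy reinforcement learning and the reuse of demonstration data, without giving the update rules. The sketch below is only an illustration of those two ideas under stated assumptions and is not the paper's SACfD algorithm: `soft_td_target` is the entropy-regularized target from standard soft actor-critic, and `sample_mixed_batch` is a hypothetical helper that draws a fixed fraction of each training batch from a demonstration buffer. All names, default values, and the fixed-fraction mixing rule are assumptions made for illustration.

```python
import random


def soft_td_target(reward, done, next_q1, next_q2, next_log_prob,
                   gamma=0.99, alpha=0.2):
    """Entropy-regularized TD target from standard soft actor-critic.

    The minimum of two critic estimates limits overestimation, and the
    -alpha * log_prob term rewards policies that keep exploring.
    """
    soft_value = min(next_q1, next_q2) - alpha * next_log_prob
    return reward + gamma * (1.0 - done) * soft_value


def sample_mixed_batch(demo_buffer, agent_buffer, batch_size,
                       demo_fraction, rng):
    """Hypothetical mixing rule: a fixed share of each batch is demonstrations.

    This is an assumed scheme for illustration, not the one in the paper.
    """
    n_demo = min(len(demo_buffer), round(batch_size * demo_fraction))
    n_agent = min(len(agent_buffer), batch_size - n_demo)
    batch = rng.sample(demo_buffer, n_demo) + rng.sample(agent_buffer, n_agent)
    rng.shuffle(batch)
    return batch


# Toy transitions tagged by origin so the mix can be inspected.
demo_buffer = [("demo", i) for i in range(10)]
agent_buffer = [("agent", i) for i in range(20)]
batch = sample_mixed_batch(demo_buffer, agent_buffer, batch_size=8,
                           demo_fraction=0.25, rng=random.Random(0))
target = soft_td_target(reward=1.0, done=0.0, next_q1=2.0, next_q2=3.0,
                        next_log_prob=-1.0, gamma=0.9, alpha=0.5)
```

In a full implementation the targets would be computed on tensors and the temperature `alpha` would typically be tuned or learned; scalars are used here only to keep the sketch self-contained.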
Pages: 1263-1270
Number of pages: 8