Adaptive guidance and integrated navigation with reinforcement meta-learning

Cited by: 66
Authors
Gaudet, Brian [1 ]
Linares, Richard [3 ]
Furfaro, Roberto [1 ,2 ]
Affiliations
[1] Univ Arizona, Dept Syst & Ind Engn, 1127 E James E Rogers Way, Tucson, AZ 85721 USA
[2] Univ Arizona, Dept Aerosp & Mech Engn, Tucson, AZ 85721 USA
[3] MIT, Dept Aeronaut & Astronaut, Cambridge, MA 02139 USA
Keywords
Guidance; Meta learning; Reinforcement learning; Landing guidance
DOI
10.1016/j.actaastro.2020.01.007
Chinese Library Classification (CLC)
V [Aviation, Aerospace]
Discipline code
08; 0825
Abstract
This paper proposes a novel adaptive guidance system developed using reinforcement meta-learning with a recurrent policy and value function approximator. The use of recurrent network layers allows the deployed policy to adapt in real time to environmental forces acting on the agent. We compare the performance of the DR/DV guidance law, an RL agent with a non-recurrent policy, and an RL agent with a recurrent policy in four challenging environments with unknown but highly variable dynamics. These tasks include a safe Mars landing with random engine failure and a landing on an asteroid with unknown environmental dynamics. We also demonstrate the ability of an RL meta-learning optimized policy to implement a guidance law using observations consisting of only Doppler radar altimeter readings in a Mars landing environment and only LIDAR altimeter readings in an asteroid landing environment, thus integrating guidance and navigation.
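
The recurrent-policy idea in the abstract can be illustrated with a short sketch. What follows is a minimal, hypothetical example, not the authors' implementation: it assumes a PyTorch GRU layer over a sequence of altimeter observations and a Gaussian action head producing thrust commands; the observation/action dimensions, class and variable names, and the choice of library are all illustrative assumptions.

import torch
import torch.nn as nn

class RecurrentPolicy(nn.Module):
    def __init__(self, obs_dim=4, act_dim=3, hidden_dim=64):
        super().__init__()
        # The recurrent layer carries hidden state across time steps,
        # letting the deployed policy adapt to unobserved dynamics
        # (e.g., a failed engine) from the observation history alone.
        self.gru = nn.GRU(obs_dim, hidden_dim, batch_first=True)
        self.mean = nn.Linear(hidden_dim, act_dim)      # thrust-command mean
        self.log_std = nn.Parameter(torch.zeros(act_dim))

    def forward(self, obs_seq, hidden=None):
        # obs_seq: (batch, time, obs_dim), e.g., a window of radar or
        # LIDAR altimeter readings; hidden persists between guidance cycles.
        out, hidden = self.gru(obs_seq, hidden)
        mean = self.mean(out)
        std = self.log_std.exp().expand_as(mean)
        return torch.distributions.Normal(mean, std), hidden

# Usage: one guidance step with a single observation.
policy = RecurrentPolicy()
obs = torch.randn(1, 1, 4)        # placeholder altimeter reading
dist, h = policy(obs)             # h is reused at the next guidance cycle
action = dist.sample()            # commanded thrust vector

Because the hidden state h is carried from one guidance cycle to the next, the policy can infer and compensate for unknown environmental forces from the observation history, which is the adaptation mechanism the abstract attributes to the recurrent network layers.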
Pages: 180-190
Page count: 11