Exploring Transfers between Earth-Moon Halo Orbits via Multi-Objective Reinforcement Learning

Cited by: 0
Authors
Sullivan, Christopher J. [1 ]
Bosanac, Natasha [1 ]
Anderson, Rodney L. [2 ]
Mashiku, Alinda K. [3 ]
Stuart, Jeffrey R. [2 ]
Affiliations
[1] Univ Colorado, Colorado Ctr Astrodynam, Smead Aerosp Engn Sci, 429 UCB, Boulder, CO 80303 USA
[2] CALTECH, Jet Prop Lab, 4800 Oak Grove Dr, Pasadena, CA 91109 USA
[3] NASA, Goddard Space Flight Ctr, Nav & Mission Design Branch, 8800 Greenbelt Rd, Greenbelt, MD 20771 USA
Source
2021 IEEE AEROSPACE CONFERENCE (AEROCONF 2021) | 2021
Funding
National Aeronautics and Space Administration (NASA);
关键词
TRAJECTORY DESIGN; LOW-THRUST; NETWORKS;
DOI
10.1109/AERO50100.2021.9438267
Chinese Library Classification
V [Aeronautics, Astronautics];
Discipline Classification Code
08; 0825;
Abstract
Multi-Reward Proximal Policy Optimization, a multi-objective deep reinforcement learning algorithm, is used to examine the design space of low-thrust trajectories for a SmallSat transferring between two libration point orbits in the Earth-Moon system. Using Multi-Reward Proximal Policy Optimization, multiple policies are simultaneously and efficiently trained on three distinct trajectory design scenarios. Each policy is trained to create a unique control scheme based on the trajectory design scenario and its assigned reward function. Each reward function is defined using a set of objectives scaled via a unique combination of weights to balance guiding the spacecraft to the target mission orbit, incentivizing faster flight times, and penalizing propellant mass usage. The policies are then evaluated on the same set of perturbed initial conditions in each scenario to generate the propellant mass usage, flight time, and state discontinuities from a reference trajectory for each control scheme. The resulting low-thrust trajectories are used to examine a subset of the multi-objective trade space for the SmallSat trajectory design scenario. By autonomously constructing the solution space, insights into the required propellant mass, flight time, and transfer geometry are rapidly obtained.
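The weighted reward scheme described in the abstract can be illustrated with a minimal sketch. The function and objective names below are illustrative assumptions, not the authors' exact formulation: each policy receives a scalar reward that combines the distance from the target orbit state, the elapsed flight time, and the propellant consumed, scaled by that policy's unique weight combination.

```python
def scalarized_reward(delta_state, dt, dm, weights):
    """Hypothetical weighted scalarization of three competing objectives.

    delta_state -- distance from the reference/target orbit state
    dt          -- time elapsed this step (penalizes longer flight times)
    dm          -- propellant mass used this step
    weights     -- (w_state, w_time, w_mass) weight combination
    """
    w_state, w_time, w_mass = weights
    # Each objective is expressed as a penalty: the agent maximizes
    # reward, so minimizing the weighted sum balances reaching the
    # target orbit against flight time and propellant usage.
    return -(w_state * delta_state + w_time * dt + w_mass * dm)


# Each policy in the ensemble is assigned its own weight combination,
# so the trained policies trace out different regions of the
# propellant-mass / flight-time trade space.
weight_sets = [(1.0, 0.1, 0.5), (1.0, 0.5, 0.1), (1.0, 0.3, 0.3)]
rewards = [scalarized_reward(0.02, 1.0, 0.001, w) for w in weight_sets]
```

Under this kind of scalarization, sweeping the weight combinations is what exposes the multi-objective trade space the paper examines.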
Pages: 13