Exploring Transfers between Earth-Moon Halo Orbits via Multi-Objective Reinforcement Learning

Cited by: 0
Authors
Sullivan, Christopher J. [1]
Bosanac, Natasha [1]
Anderson, Rodney L. [2]
Mashiku, Alinda K. [3]
Stuart, Jeffrey R. [2]
Affiliations
[1] Univ Colorado, Colorado Ctr Astrodynam, Smead Aerosp Engn Sci, 429 UCB, Boulder, CO 80303 USA
[2] CALTECH, Jet Prop Lab, 4800 Oak Grove Dr, Pasadena, CA 91109 USA
[3] NASA, Goddard Space Flight Ctr, Nav & Mission Design Branch, 8800 Greenbelt Rd, Greenbelt, MD 20771 USA
Source
2021 IEEE AEROSPACE CONFERENCE (AEROCONF 2021) | 2021
Funding
National Aeronautics and Space Administration (NASA)
Keywords
TRAJECTORY DESIGN; LOW-THRUST; NETWORKS;
DOI
10.1109/AERO50100.2021.9438267
Chinese Library Classification number
V [Aeronautics, Astronautics]
Discipline classification codes
08; 0825
Abstract
Multi-Reward Proximal Policy Optimization, a multi-objective deep reinforcement learning algorithm, is used to examine the design space of low-thrust trajectories for a SmallSat transferring between two libration point orbits in the Earth-Moon system. Using Multi-Reward Proximal Policy Optimization, multiple policies are simultaneously and efficiently trained on three distinct trajectory design scenarios. Each policy is trained to create a unique control scheme based on the trajectory design scenario and assigned reward function. Each reward function is defined using a set of objectives that are scaled via a unique combination of weights to balance guiding the spacecraft to the target mission orbit, incentivizing faster flight times, and penalizing propellant mass usage. Then, the policies are evaluated on the same set of perturbed initial conditions in each scenario to generate the propellant mass usage, flight time, and state discontinuities from a reference trajectory for each control scheme. The resulting low-thrust trajectories are used to examine a subset of the multi-objective trade space for the SmallSat trajectory design scenario. By autonomously constructing the solution space, insights into the required propellant mass, flight time, and transfer geometry are rapidly achieved.
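
The weighted reward described in the abstract can be illustrated with a minimal sketch. This is not the paper's reward function: the class `RewardWeights`, the function `step_reward`, the linear form, and all numeric values are assumptions chosen only to show how one combination of weights per policy trades off reaching the target orbit, flight time, and propellant use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical weights; each policy would get its own combination."""
    target: float  # scales the penalty on deviation from the target orbit
    time: float    # scales the penalty on elapsed flight time
    mass: float    # scales the penalty on propellant mass used


def step_reward(w: RewardWeights, state_error: float, dt: float,
                propellant_used: float) -> float:
    """Assumed linear form: negative weighted sum of the three penalties."""
    return -(w.target * state_error + w.time * dt + w.mass * propellant_used)


# Three illustrative policies: balanced, time-averse, propellant-averse.
weight_sets = [
    RewardWeights(target=1.0, time=0.1, mass=0.1),
    RewardWeights(target=1.0, time=1.0, mass=0.1),
    RewardWeights(target=1.0, time=0.1, mass=1.0),
]

# Same (made-up) transition scored under each policy's reward function.
rewards = [
    step_reward(w, state_error=0.5, dt=1.0, propellant_used=0.2)
    for w in weight_sets
]
```

Scoring the same transition under each weight set shows why differently weighted policies learn different control schemes: the time-averse policy is penalized most for the same step, so it is pushed toward faster transfers.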
Pages: 13