We develop a computationally efficient, learning-based forward-backward stochastic differential equations (FBSDE) controller for both continuous and hybrid dynamical (HD) systems subject to stochastic noise and state constraints. Solutions to stochastic optimal control (SOC) problems satisfy the Hamilton-Jacobi-Bellman (HJB) equation. Using current FBSDE-based solutions, the optimal control can be obtained from the HJB equation using deep neural networks (e.g., long short-term memory (LSTM) networks). To ensure the learned controller respects the constraint boundaries, we enforce the state constraints using a soft penalty function. Extending previous work, we adapt the deep FBSDE (DFBSDE) control framework to handle HD systems consisting of continuous dynamics and a deterministic discrete state change. We demonstrate our proposed algorithm in simulation on a continuous nonlinear system (cart-pole) and a hybrid nonlinear system (five-link biped). © 2023 Elsevier Ltd. All rights reserved.
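The abstract does not state the form of the soft penalty, so the following is only a minimal sketch of one common choice: a quadratic hinge penalty that is zero inside box constraints and grows with the squared violation outside them. The function names (`soft_state_penalty`, `running_cost`), the box bounds, the quadratic state and control cost, and the `weight` parameter are all illustrative assumptions, not the authors' formulation.

```python
def soft_state_penalty(x, lower, upper, weight=10.0):
    """Hypothetical quadratic hinge penalty for box state constraints.

    Returns 0.0 when every component of x lies in [lower, upper];
    otherwise returns weight times the sum of squared violations.
    """
    total = 0.0
    for xi, lo, hi in zip(x, lower, upper):
        # At most one of the two terms is nonzero for a valid box (lo <= hi).
        violation = max(lo - xi, 0.0) + max(xi - hi, 0.0)
        total += violation ** 2
    return weight * total


def running_cost(x, u, lower, upper, weight=10.0):
    """Assumed quadratic state/control cost augmented with the soft penalty."""
    state_cost = sum(xi * xi for xi in x)
    control_cost = sum(ui * ui for ui in u)
    return state_cost + control_cost + soft_state_penalty(x, lower, upper, weight)
```

Because the penalty enters the cost rather than the dynamics, a learned controller is discouraged from, but not strictly prevented from, crossing the constraint boundary; a larger `weight` tightens that trade-off.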