Learning Min-norm Stabilizing Control Laws for Systems with Unknown Dynamics

Citations: 0

Authors:

Westenbroek, Tyler [1]

Castaneda, Fernando [2]

Agrawal, Ayush [2]

Sastry, S. Shankar [1]

Sreenath, Koushil [2]

Affiliations:

[1] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA

[2] Univ Calif Berkeley, Dept Mech Engn, Berkeley, CA 94720 USA

Source:

2020 59TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC) | 2020

Funding:

US National Science Foundation (NSF);

Keywords:

TO-STATE STABILITY; ADAPTIVE-CONTROL; LYAPUNOV;

DOI:

10.1109/cdc42340.2020.9304118

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

This paper introduces a framework for learning a minimum-norm stabilizing controller for a system with unknown dynamics using model-free policy optimization methods. The approach begins by designing a Control Lyapunov Function (CLF) for a (possibly inaccurate) dynamics model of the system, along with a function that specifies a minimum acceptable rate of energy dissipation for the CLF at different points in the state-space. Treating the energy dissipation condition as a constraint on the desired closed-loop behavior of the real-world system, we use penalty methods to formulate an unconstrained optimization problem over the parameters of a learned controller, which can be solved with model-free policy optimization algorithms using data collected from the plant. We discuss when the optimization learns a stabilizing controller for the real-world system and derive conditions on the structure of the learned controller that ensure the optimization is strongly convex, meaning the globally optimal solution can be found reliably. We validate the approach in simulation, first for a double pendulum, and then generalize the framework to learn stable walking controllers for underactuated bipedal robots using the Hybrid Zero Dynamics framework. By encoding a large amount of structure into the learning problem, we are able to learn stabilizing controllers for both systems with only minutes or even seconds of training data.

Pages: 737-744

Number of pages: 8

50 items in total

[41] CONSTRUCTION OF STABILIZING AIRBORNE LABORATORY MOTION CONTROL LAWS FOR TESTING AIRCRAFT NAVIGATIONAL SYSTEMS
VASIN, YV
NOSOV, VR
FELDMAN, MI
SOVIET JOURNAL OF COMPUTER AND SYSTEMS SCIENCES, 1992, 30 (04) : 139 - 143
[42] The construction of stabilizing control laws of the motion of flying laboratory for the testing of navigation systems of aircrafts
Vasin, Yu.V.
Nosov, V.R.
Fel'dman, M.I.
Izvestiya Akademii Nauk: Tekhnicheskaia Kibernetika, 1991, (04) : 147 - 151
[43] A new methodology to compute stabilizing control laws for continuous-time LTV systems
Agulhari, Cristiano M.
Garcia, Germain
Tarbouriech, Sophie
Peres, Pedro L. D.
INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2018, 28 (13) : 4045 - 4057
[44] Simple asymptotic stabilizing control laws for linear time-invariant hybrid systems
delaSen, M
CYBERNETICS AND SYSTEMS, 1997, 28 (07) : 547 - 570
[45] On stabilizing linear systems with input saturation via soft variable structure control laws
Roethig, Andreea
Adamy, Juergen
SYSTEMS & CONTROL LETTERS, 2016, 89 : 47 - 54
[46] Stabilizing Unknown Nonlinear Systems via Decentralized High-gain Adaptive Control
Kawano, Yu
Sun, Zhiyong
IFAC PAPERSONLINE, 2023, 56 (02) : 9209 - 9214
[47] Global Stabilizing Control for A Class of Stochastic High-order Nonlinear Systems with Unknown Control Direction
Zhang Jian
Mu Xiaowu
Wei Jumei
Liu Yungang
2015 34TH CHINESE CONTROL CONFERENCE (CCC), 2015 : 1699 - 1704
[48] Stabilizing control design for a class of high-order nonlinear systems with unknown but identical control coefficients
Sun, Zong-Yao
Liu, Yun-Gang
Zidonghua Xuebao/Acta Automatica Sinica, 2007, 33 (03) : 331 - 334
[49] Active Learning for Estimating Reachable Sets for Systems With Unknown Dynamics
Chakrabarty, Ankush
Danielson, Claus
Di Cairano, Stefano
Raghunathan, Arvind
IEEE TRANSACTIONS ON CYBERNETICS, 2022, 52 (04) : 2531 - 2542
[50] Extreme learning control of surface vehicles with unknown dynamics and disturbances
Sun, Jing-Chao
Wang, Ning
Er, Meng Joo
Liu, Yan-Cheng
NEUROCOMPUTING, 2015, 167 : 535 - 542