Follow the Clairvoyant: an Imitation Learning Approach to Optimal Control

Cited by: 3
Authors
Martin, Andrea [1 ]
Furieri, Luca [1 ]
Dorfler, Florian [2 ]
Lygeros, John [2 ]
Ferrari-Trecate, Giancarlo [1 ]
Affiliations
[1] Ecole Polytech Fed Lausanne, Inst Mech Engn, Lausanne, Switzerland
[2] Swiss Fed Inst Technol, Dept Informat Technol & Elect Engn, Zurich, Switzerland
Funding
Swiss National Science Foundation;
Keywords
optimal control; robust control; system level synthesis; imitation learning; dynamic regret; regret minimization; STABILITY;
DOI
10.1016/j.ifacol.2023.10.1344
Chinese Library Classification (CLC)
TP [Automation technology; computer technology];
Discipline Classification Code
0812;
Abstract
We consider control of dynamical systems through the lens of competitive analysis. Most prior work in this area focuses on minimizing regret, that is, the loss relative to an ideal clairvoyant policy that has noncausal access to past, present, and future disturbances. Motivated by the observation that the optimal cost only provides coarse information about the ideal closed-loop behavior, we instead propose directly minimizing the tracking error relative to the optimal trajectories in hindsight, i.e., imitating the clairvoyant policy. By embracing a system level perspective, we present an efficient optimization-based approach for computing follow-the-clairvoyant (FTC) safe controllers. We prove that these attain minimal regret if no constraints are imposed on the noncausal benchmark. In addition, we present numerical experiments to show that our policy retains the hallmark of competitive algorithms of interpolating between classical H-2 and H-infinity control laws, while consistently outperforming regret minimization methods in constrained scenarios thanks to their superior ability to chase the clairvoyant. Copyright (c) 2023 The Authors.
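The following is a minimal numerical sketch of the clairvoyant benchmark the abstract refers to, not the paper's FTC synthesis (which relies on system level synthesis). For a toy linear-quadratic problem with assumed dynamics, costs, and horizon, the clairvoyant trajectory is obtained by solving a convex program with full knowledge of the realized disturbance sequence, and the tracking (imitation) error of a causal finite-horizon LQR rollout relative to that trajectory is then evaluated.

```python
# Hedged sketch: clairvoyant-optimal trajectory vs. a causal LQR rollout on a
# toy finite-horizon linear-quadratic problem. Illustrative assumptions only;
# this is not the FTC controller synthesis proposed in the paper.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)

# Assumed double-integrator dynamics, quadratic stage costs, and horizon.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)
T = 20
w = 0.1 * rng.standard_normal((T, 2))  # realized disturbance sequence

# --- Clairvoyant benchmark: optimal inputs with full (noncausal) knowledge of w.
x = cp.Variable((T + 1, 2))
u = cp.Variable((T, 1))
cost = 0
constraints = [x[0] == np.zeros(2)]
for t in range(T):
    cost += cp.quad_form(x[t], Q) + cp.quad_form(u[t], R)
    constraints.append(x[t + 1] == A @ x[t] + B @ u[t] + w[t])
cost += cp.quad_form(x[T], Q)
cp.Problem(cp.Minimize(cost), constraints).solve()
x_clair = x.value  # the "optimal trajectory in hindsight" to imitate

# --- Causal baseline: finite-horizon LQR gains via backward Riccati recursion.
P = Q.copy()
K = []
for _ in range(T):
    K_t = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + A.T @ P @ (A - B @ K_t)
    K.append(K_t)
K = K[::-1]

x_lqr = np.zeros((T + 1, 2))
for t in range(T):
    u_t = -K[t] @ x_lqr[t]
    x_lqr[t + 1] = A @ x_lqr[t] + B @ u_t + w[t]

# Imitation (tracking) error relative to the clairvoyant trajectory: the
# quantity an FTC controller is designed to keep small.
err = np.sum(np.linalg.norm(x_lqr - x_clair, axis=1) ** 2)
print(f"cumulative squared tracking error vs. clairvoyant: {err:.3f}")
```

The clairvoyant step above is a plain convex quadratic program because the disturbances are treated as known data; the paper's contribution lies in synthesizing a causal, possibly constrained policy whose closed-loop trajectory stays close to this benchmark.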
Pages: 2589-2594
Number of pages: 6