Computing Stabilizing Feedback Gains via a Model-Free Policy Gradient Method

Cited by: 6
Authors
Ozaslan, Ibrahim K. [1 ]
Mohammadi, Hesameddin [1 ]
Jovanovic, Mihailo R. [1 ]
Affiliation
[1] Univ Southern Calif, Ming Hsieh Dept Elect & Comp Engn, Los Angeles, CA 90089 USA
Source
IEEE CONTROL SYSTEMS LETTERS | 2022, Vol. 7
Funding
U.S. National Science Foundation;
Keywords
Costs; Convergence; Linear systems; Gradient methods; Computational modeling; Linear programming; Complexity theory; Data-driven control; linear quadratic regulator; model-free control; nonconvex optimization; random search method; reinforcement learning; sample complexity; CONVERGENCE;
DOI
10.1109/LCSYS.2022.3188180
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
In spite of the lack of convexity, convergence and sample complexity properties were recently established for the random search method applied to the linear quadratic regulator (LQR) problem. Since policy gradient approaches require an initial stabilizing controller, we propose a model-free algorithm that searches over the set of state-feedback gains and returns a stabilizing controller in a finite number of iterations. Our algorithm involves a sequence of relaxed LQR problems for which the associated domains converge to the set of stabilizing controllers for the original continuous-time linear time-invariant system. Starting from a stabilizing controller for the relaxed problem, the proposed approach alternates between updating the controller via policy gradient iterations and decreasing the relaxation parameter in the LQR cost, while preserving stability at all iterations. By properly tuning the relaxation parameter updates, we ensure that the cost values do not exceed a uniform threshold and establish computable bounds on the total number of iterations.
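To make the procedure concrete, the following Python sketch (not taken from the letter) illustrates one possible instantiation of the alternation described in the abstract: zeroth-order (two-point random search) policy-gradient steps on a relaxed LQR cost, interleaved with decreases of a relaxation parameter gamma. The specific relaxation used here (shifting the dynamics to A - (gamma/2)I so that a larger set of gains is admissible for larger gamma), the example system matrices, the step sizes, and the schedule are all illustrative assumptions, not the authors' construction.

import numpy as np

rng = np.random.default_rng(0)

# Example open-loop unstable system (illustrative values, not from the letter).
A = np.array([[0.5, 1.0],
              [0.0, 0.3]])
B = np.eye(2)
Q = np.eye(2)
R = np.eye(2)
n, m = B.shape

def relaxed_cost(K, gamma, horizon=5.0, dt=0.01, n_init=2):
    # Finite-horizon estimate of the LQR cost for the shifted dynamics A - (gamma/2) I;
    # only cost values of simulated rollouts are used, mimicking a model-free oracle.
    A_g = A - 0.5 * gamma * np.eye(n)
    total = 0.0
    for _ in range(n_init):
        x = rng.standard_normal(n)
        cost = 0.0
        for _ in range(int(horizon / dt)):
            u = -K @ x
            cost += (x @ Q @ x + u @ R @ u) * dt
            x = x + (A_g @ x + B @ u) * dt        # forward-Euler simulation
            if np.linalg.norm(x) > 1e6:
                return np.inf                     # treat divergence as infinite cost
        total += cost
    return total / n_init

def random_search_step(K, gamma, step=5e-3, smoothing=1e-2, n_dirs=4):
    # One zeroth-order (two-point) policy-gradient step on the relaxed cost.
    grad = np.zeros_like(K)
    for _ in range(n_dirs):
        U = rng.standard_normal(K.shape)
        df = relaxed_cost(K + smoothing * U, gamma) - relaxed_cost(K - smoothing * U, gamma)
        if np.isfinite(df):                       # skip destabilizing perturbations
            grad += df / (2.0 * smoothing) * U
    return K - step * grad / n_dirs

# Start with gamma large enough that K = 0 stabilizes the relaxed system, then
# alternate policy-gradient updates with decreases of the relaxation parameter.
gamma_0 = 2.0 * max(np.linalg.eigvals(A).real) + 1.0
K = np.zeros((m, n))
for gamma in np.linspace(gamma_0, 0.0, 6):        # illustrative relaxation schedule
    for _ in range(10):                           # inner policy-gradient iterations
        K = random_search_step(K, gamma)

print("closed-loop eigenvalues:", np.linalg.eigvals(A - B @ K))

Only cost values of simulated rollouts are queried inside random_search_step, so the update itself uses no model information. With a crude schedule like the one above there is, of course, no guarantee that the final gain stabilizes the original system; providing such guarantees, via proper tuning of the relaxation parameter updates, is precisely the contribution of the letter.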
Pages: 407 - 412
Page count: 6
Related Papers
50 records in total
  • [1] A convergent algorithm for computing stabilizing static output feedback gains
    Yu, JT
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2004, 49 (12) : 2271 - 2275
  • [2] A Model-Free H∞ Control Method Based on Off-Policy With Output Data Feedback
    Li, Z.
    Fan, J.-L.
    Jiang, Y.
    Chai, T.-Y.
    SCIENCE PRESS, (47) : 2182 - 2193
  • [3] Model-free Reinforcement Learning of Semantic Communication by Stochastic Policy Gradient
    Beck, Edgar
    Bockelmann, Carsten
    Dekorsy, Armin
    2024 IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING FOR COMMUNICATION AND NETWORKING, ICMLCN 2024, 2024, : 367 - 373
  • [4] On a New Paradigm for Stock Trading Via a Model-Free Feedback Controller
    Barmish, B. Ross
    Primbs, James A.
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2016, 61 (03) : 662 - 676
  • [5] Deterministic policy gradient adaptive dynamic programming for model-free optimal control
    Zhang, Yongwei
    Zhao, Bo
    Liu, Derong
    NEUROCOMPUTING, 2020, 387 : 40 - 50
  • [6] Model-Free Nonlinear Feedback Optimization
    He, Zhiyu
    Bolognani, Saverio
    He, Jianping
    Dorfler, Florian
    Guan, Xinping
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2024, 69 (07) : 4554 - 4569
  • [7] Combining Model-Based Design and Model-Free Policy Optimization to Learn Safe, Stabilizing Controllers
    Westenbroek, Tyler
    Agrawal, Ayush
    Castaneda, Fernando
    Sastry, S. Shankar
    Sreenath, Koushil
    IFAC PAPERSONLINE, 2021, 54 (05): : 19 - 24
  • [8] Gradient flow approach to computing LQ optimal output feedback gains
    Yan, Wei-Yong
    Teo, Kok L.
    Moore, John B.
    OPTIMAL CONTROL APPLICATIONS AND METHODS, 1994, 15 (01) : 67 - 75
  • [9] Plume Tracing via Model-Free Reinforcement Learning Method
    Hu, Hangkai
    Song, Shiji
    Chen, C. L. Phillip
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (08) : 2515 - 2527
  • [10] Model-free Control Design Using Policy Gradient Reinforcement Learning in LPV Framework
    Bao, Yajie
    Velni, Javad Mohammadpour
    2021 EUROPEAN CONTROL CONFERENCE (ECC), 2021, : 150 - 155