Provably Safe Reinforcement Learning via Action Projection Using Reachability Analysis and Polynomial Zonotopes

Cited by: 9
Authors
Kochdumper, Niklas [1, 2]
Krasowski, Hanna [1]
Wang, Xiao [1]
Bak, Stanley [2]
Althoff, Matthias [1]
Affiliations
[1] Tech Univ Munich, Dept Comp Engn, D-85748 Garching, Germany
[2] SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY 11794 USA
Source
Funding
European Research Council
Keywords
Safety; Reinforcement learning; Reachability analysis; Optimization; Generators; Training; Measurement errors; Action projection; Reach-avoid problems; Convex hull
DOI
10.1109/OJCSYS.2023.3256305
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
While reinforcement learning produces very promising results for many applications, its main disadvantage is the lack of safety guarantees, which prevents its use in safety-critical systems. In this work, we address this issue with a safety shield for nonlinear continuous systems that solve reach-avoid tasks. Our safety shield prevents potentially unsafe actions proposed by a reinforcement learning agent from being applied by projecting each proposed action to the closest safe action. This approach is called action projection and is implemented via mixed-integer optimization. The safety constraints for action projection are obtained by applying parameterized reachability analysis using polynomial zonotopes, which makes it possible to accurately capture the nonlinear effects of the actions on the system. In contrast to other state-of-the-art approaches for action projection, our safety shield can efficiently handle input constraints and dynamic obstacles, eases the incorporation of the spatial robot dimensions into the safety constraints, guarantees robust safety despite process noise and measurement errors, and is well suited for high-dimensional systems, as we demonstrate on several challenging benchmark systems.
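The idea of projecting a proposed action to the closest safe action can be illustrated with a minimal sketch. It is not the method of the paper: the abstract states that the safety constraints come from reachability analysis with polynomial zonotopes and that the projection is solved by mixed-integer optimization. Here the safe set is instead assumed to be a hypothetical union of axis-aligned boxes in action space, and the projection is found by enumerating the boxes. All names (`project_onto_box`, `project_action`, `safe_boxes`) are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

# A box is (lower bounds, upper bounds); this representation is an assumption
# made for illustration, not the constraint form used in the paper.
Box = Tuple[Sequence[float], Sequence[float]]


def project_onto_box(action: Sequence[float], box: Box) -> List[float]:
    """Euclidean projection onto an axis-aligned box (per-coordinate clipping)."""
    lower, upper = box
    return [min(max(a, lo), hi) for a, lo, hi in zip(action, lower, upper)]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two actions."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def project_action(action: Sequence[float], safe_boxes: Sequence[Box]) -> List[float]:
    """Return the closest point to `action` in the union of `safe_boxes`.

    Each box is a convex piece of a non-convex safe set. Choosing the best
    piece by enumeration plays the role that integer variables would play
    in a mixed-integer formulation.
    """
    if not safe_boxes:
        raise ValueError("no safe action available")
    candidates = [project_onto_box(action, box) for box in safe_boxes]
    return min(candidates, key=lambda c: squared_distance(action, c))
```

A proposed action that already lies in one of the boxes is returned unchanged, so the shield only intervenes when the agent's action falls outside the assumed safe set.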
Pages: 79 - 92
Number of pages: 14
Related papers
50 in total
  • [1] Reachability Analysis Using Constrained Polynomial Logical Zonotopes
    Hafez, Ahmad
    Jiang, Frank J.
    Johansson, Karl H.
    Alanwar, Amr
    IEEE CONTROL SYSTEMS LETTERS, 2024, 8 : 2277 - 2282
  • [2] Reachability Analysis for Linear Systems with Uncertain Parameters using Polynomial Zonotopes
    Huang, Yushen
    Luo, Ertai
    Bak, Stanley
    Sun, Yifan
    arXiv
  • [3] Reachability analysis for linear systems with uncertain parameters using polynomial zonotopes
    Huang, Yushen
    Luo, Ertai
    Bak, Stanley
    Sun, Yifan
    NONLINEAR ANALYSIS-HYBRID SYSTEMS, 2025, 56
  • [4] Safe Reinforcement Learning Using Black-Box Reachability Analysis
    Selim, Mahmoud
    Alanwar, Amr
    Kousik, Shreyas
    Gao, Grace
    Pavone, Marco
    Johansson, Karl H.
    arXiv, 2022
  • [5] Safe Reinforcement Learning Using Black-Box Reachability Analysis
    Selim, Mahmoud
    Alanwar, Amr
    Kousik, Shreyas
    Gao, Grace
    Pavone, Marco
    Johansson, Karl H.
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2022, 7 (04) : 10665 - 10672
  • [6] Sparse Polynomial Zonotopes: A Novel Set Representation for Reachability Analysis
    Kochdumper, Niklas
    Althoff, Matthias
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2021, 66 (09) : 4043 - 4058
  • [7] Iterative Reachability Estimation for Safe Reinforcement Learning
    Ganai, Milan
    Gong, Zheng
    Yu, Chenning
    Herbert, Sylvia
    Gao, Sicun
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [8] Safe Reinforcement Learning via Projection on a Safe Set: How to Achieve Optimality?
    Gros, Sebastien
    Zanon, Mario
    Bemporad, Alberto
    IFAC PAPERSONLINE, 2020, 53 (02) : 8076 - 8081
  • [9] Safe Exploration in Reinforcement Learning by Reachability Analysis over Learned Models
    Wang, Yuning
    Zhu, He
    COMPUTER AIDED VERIFICATION, PT III, CAV 2024, 2024, 14683 : 232 - 255
  • [10] Reducing Safety Interventions in Provably Safe Reinforcement Learning
    Thumm, Jakob
    Pelat, Guillaume
    Althoff, Matthias
    2023 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2023 : 7515 - 7522