Runtime-Safety-Guided Policy Repair

Cited by: 3
Authors
Zhou, Weichao [1]
Gao, Ruihan [2]
Kim, BaekGyu [3]
Kang, Eunsuk [4]
Li, Wenchao [1]
Affiliations
[1] Boston Univ, Boston, MA 02215 USA
[2] Nanyang Technol Univ, Singapore, Singapore
[3] Toyota Motor North Amer R&D, Mountain View, CA USA
[4] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Funding
US National Science Foundation (NSF)
DOI
10.1007/978-3-030-60508-7_7
Chinese Library Classification (CLC) number
TP31 [Computer software]
Subject classification codes
081202; 0835
Abstract
We study the problem of policy repair for learning-based control policies in safety-critical settings. We consider an architecture in which a high-performance learning-based control policy (e.g., one trained as a neural network) is paired with a model-based safety controller. The safety controller can predict whether the trained policy will lead the system to an unsafe state and can take over control when necessary. While this architecture provides added safety assurances, frequent, intermittent switching between the trained policy and the safety controller can result in undesirable behaviors and reduced performance. We propose to reduce or even eliminate control switching by 'repairing' the trained policy, using runtime data produced by the safety controller, so that the repaired policy deviates minimally from the original policy. The key idea behind our approach is the formulation of a trajectory optimization problem that allows joint reasoning about the policy update and the safety constraints. Experimental results demonstrate that our approach is effective even when the system model in the safety controller is unknown and only approximated.
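The switching architecture described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the one-dimensional dynamics, the constant stand-in for the trained policy, the proportional safety controller, and all names and constants (`DT`, `X_MAX`, `HORIZON`, `predicts_unsafe`, `run`, `count_switches`) are assumptions chosen only to show how a model-based monitor might predict unsafe behavior, take over control, and log the runtime data that a repair step could later use.

```python
from typing import Callable, List, Tuple

# All constants below are illustrative assumptions, not values from the paper.
DT = 0.1       # integration step of the toy model
X_MAX = 1.0    # the safe set is |x| <= X_MAX
HORIZON = 5    # number of look-ahead steps used by the safety monitor

Policy = Callable[[float], float]
LogEntry = Tuple[float, float, bool]  # (state, applied control, takeover flag)


def step(x: float, u: float) -> float:
    """Toy single-integrator model: x' = x + DT * u."""
    return x + DT * u


def learned_policy(x: float) -> float:
    """Stand-in for a trained policy; it always pushes toward the boundary."""
    return 1.0


def safe_controller(x: float) -> float:
    """Hypothetical safety controller: proportional pull toward the origin."""
    return -2.0 * x


def predicts_unsafe(x: float, policy: Policy) -> bool:
    """Roll the model forward under `policy` and report any predicted violation."""
    for _ in range(HORIZON):
        x = step(x, policy(x))
        if abs(x) > X_MAX:
            return True
    return False


def run(x0: float, steps: int) -> List[LogEntry]:
    """Simulate the paired controllers and log which one acted at each step."""
    x = x0
    log: List[LogEntry] = []
    for _ in range(steps):
        takeover = predicts_unsafe(x, learned_policy)
        u = safe_controller(x) if takeover else learned_policy(x)
        log.append((x, u, takeover))
        x = step(x, u)
    return log


def count_switches(log: List[LogEntry]) -> int:
    """Count how often control changes hands between the two controllers."""
    flags = [entry[2] for entry in log]
    return sum(1 for a, b in zip(flags, flags[1:]) if a != b)
```

Calling `run(0.0, 50)` and passing the result to `count_switches` shows control changing hands repeatedly near the boundary in this toy setting, which is the kind of switching the abstract proposes to reduce by repairing the policy.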
Pages: 131-150
Page count: 20
Related papers
50 in total
  • [1] Safety Guided Policy Optimization
    Kim, Dohyeong
    Kim, Yunho
    Lee, Kyungjae
    Oh, Songhwai
    2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2022: 2462-2467
  • [2] Checking and Enforcing Safety: Runtime Verification and Runtime Reflection
    Leucker, Martin
    ERCIM NEWS, 2008, (75): 35-36
  • [3] A runtime framework for system safety
    Papp, Z
    Zoutendijk, A
    IEEE IV2003: INTELLIGENT VEHICLES SYMPOSIUM, PROCEEDINGS, 2003: 394-399
  • [4] Automated runtime repair of business processes
    van Beest, N. R. T. P.
    Kaldeli, E.
    Bulanov, R.
    Wortmann, J. C.
    Lazovik, A.
    INFORMATION SYSTEMS, 2014, 39: 45-79
  • [5] Guided prefetching based on runtime access patterns
    Tao, Jie
    Kneip, Georges
    Karl, Wolfgang
    COMPUTATIONAL SCIENCE - ICCS 2008, PT 3, 2008, 5103: 268-+
  • [6] Enforcing Safety at Runtime for Systems with Disturbances
    Abate, Matthew
    Coogan, Samuel
    2020 59TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2020: 2038-2043
  • [7] Runtime Verification of C Memory Safety
    Rosu, Grigore
    Schulte, Wolfram
    Serbanuta, Traian Florin
    RUNTIME VERIFICATION, 2009, 5779: 132-+
  • [8] Runtime Safety Analysis for Safe Reconfiguration
    Priesterjahn, Claudia
    Heinzemann, Christian
    Schaefer, Wilhelm
    Tichy, Matthias
    2012 10TH IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL INFORMATICS (INDIN), 2012: 1092-1097
  • [9] SpecRepair: Counter-Example Guided Safety Repair of Deep Neural Networks
    Bauer-Marquart, Fabian
    Boetius, David
    Leue, Stefan
    Schilling, Christian
    MODEL CHECKING SOFTWARE, SPIN 2022, 2022, 13255: 79-96
  • [10] Runtime Monitoring of Cross-cutting Policy
    Nakajima, Shin
    Ubayashi, Naoyasu
    Hokamura, Keiji
    2009 ICSE WORKSHOP ON ASPECT-ORIENTED REQUIREMENTS ENGINEERING AND ARCHITECTURE DESIGN, 2009: 20-+