Probabilistic Guarantees for Safe Deep Reinforcement Learning

Cited by: 15
Authors: Bacci, Edoardo [1]; Parker, David [1]
Affiliation: [1] University of Birmingham, Birmingham, England
Source: FORMAL MODELING AND ANALYSIS OF TIMED SYSTEMS, FORMATS 2020 | 2020, Vol. 12288
Funding: European Research Council
DOI: 10.1007/978-3-030-57628-8_14
CLC Classification: TP31 [Computer software]
Subject Classification: 081202; 0835
Abstract:
Deep reinforcement learning has been successfully applied to many control tasks, but the application of such controllers in safety-critical scenarios has been limited due to safety concerns. Rigorous testing of these controllers is challenging, particularly when they operate in probabilistic environments due to, for example, hardware faults or noisy sensors. We propose MOSAIC, an algorithm for measuring the safety of deep reinforcement learning controllers in stochastic settings. Our approach is based on the iterative construction of a formal abstraction of a controller's execution in an environment, and leverages probabilistic model checking of Markov decision processes to produce probabilistic guarantees on safe behaviour over a finite time horizon. It produces bounds on the probability of safe operation of the controller for different initial configurations and identifies regions where correct behaviour can be guaranteed. We implement and evaluate our approach on controllers trained for several benchmark control problems.
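The abstract's central computation, bounding the probability of safe behaviour over a finite time horizon by analysing a Markov decision process, can be illustrated with backward induction. The following is a minimal sketch, not the paper's MOSAIC implementation: in MOSAIC the MDP arises from an iteratively constructed abstraction of the controller's execution and is analysed with a probabilistic model checker, whereas here the toy MDP, the state names, and the `safety_bounds` helper are all invented for illustration.

```python
# Illustrative sketch (not the paper's MOSAIC algorithm): backward
# induction on a small MDP to bound the probability of remaining in
# safe states over a finite horizon. Model and names are invented.

def safety_bounds(states, actions, trans, safe, horizon):
    """Return (lo, hi): for each state, the minimum and maximum
    probability, over resolutions of the MDP's nondeterminism, of
    staying inside `safe` for `horizon` steps.
    trans[s][a] is a list of (probability, successor) pairs."""
    lo = {s: (1.0 if s in safe else 0.0) for s in states}
    hi = dict(lo)
    for _ in range(horizon):
        new_lo, new_hi = {}, {}
        for s in states:
            if s not in safe:  # already unsafe: safety probability is 0
                new_lo[s] = new_hi[s] = 0.0
                continue
            step_lo = [sum(p * lo[t] for p, t in trans[s][a]) for a in actions[s]]
            step_hi = [sum(p * hi[t] for p, t in trans[s][a]) for a in actions[s]]
            new_lo[s], new_hi[s] = min(step_lo), max(step_hi)
        lo, hi = new_lo, new_hi
    return lo, hi

# Toy 2-state example: from s0, the only action stays safe with
# probability 0.9 and fails with probability 0.1 at each step.
states = ["s0", "fail"]
actions = {"s0": ["a"], "fail": ["a"]}
trans = {"s0": {"a": [(0.9, "s0"), (0.1, "fail")]},
         "fail": {"a": [(1.0, "fail")]}}
lo, hi = safety_bounds(states, actions, trans, {"s0"}, horizon=2)
print(lo["s0"], hi["s0"])  # both approximately 0.81 (= 0.9 ** 2)
```

Because the toy MDP has a single action per state, the lower and upper bounds coincide; in an abstraction of a learned controller, the nondeterminism introduced by abstract states makes them differ, and the gap is what the iterative refinement described in the abstract tightens.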
Pages: 231-248 (18 pages)
Related papers (50 in total):
  • [1] Krasowski, Hanna; Akella, Prithvi; Ames, Aaron D.; Althoff, Matthias: Safe Reinforcement Learning with Probabilistic Guarantees Satisfying Temporal Logic Specifications in Continuous Action Spaces. 62nd IEEE Conference on Decision and Control (CDC), 2023, pp. 4372-4378
  • [2] Garcia, Javier; Fernandez, Fernando: Probabilistic Policy Reuse for Safe Reinforcement Learning. ACM Transactions on Autonomous and Adaptive Systems, 2019, 13(3)
  • [3] Berkenkamp, Felix; Turchetta, Matteo; Schoellig, Angela P.; Krause, Andreas: Safe Model-based Reinforcement Learning with Stability Guarantees. Advances in Neural Information Processing Systems 30 (NIPS 2017), 2017
  • [4] Marzari, Luca; Corsi, Davide; Marchesini, Enrico; Farinelli, Alessandro; Cicalese, Ferdinando: Enumerating Safe Regions in Deep Neural Networks with Provable Probabilistic Guarantees. Thirty-Eighth AAAI Conference on Artificial Intelligence, Vol. 38, No. 19, 2024, pp. 21387-21394
  • [5] Li, Ji; Chen, Zewei; Liu, Xin: Deep Reinforcement Learning for Partial Offloading with Reliability Guarantees. 19th IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA/BDCloud/SocialCom/SustainCom 2021), 2021, pp. 1027-1034
  • [6] Hasanbeig, M.; Kantaros, Y.; Abate, A.; Kroening, D.; Pappas, G. J.; Lee, I.: Reinforcement Learning for Temporal Logic Control Synthesis with Probabilistic Satisfaction Guarantees. 58th IEEE Conference on Decision and Control (CDC), 2019, pp. 5338-5343
  • [7] Yang, Wen-Chi; Marra, Giuseppe; Rens, Gavin; De Raedt, Luc: Safe Reinforcement Learning via Probabilistic Logic Shields. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence (IJCAI 2023), 2023, pp. 5739-5749
  • [8] Yang, Wen-Chi; Marra, Giuseppe; Rens, Gavin; De Raedt, Luc: Safe Reinforcement Learning via Probabilistic Logic Shields. Neural-Symbolic Learning and Reasoning 2023 (NeSy 2023), 2023
  • [9] Bacci, Edoardo; Parker, David: Verified Probabilistic Policies for Deep Reinforcement Learning. NASA Formal Methods (NFM 2022), 2022, Vol. 13260, pp. 193-212
  • [10] Cheng, Jiangchang; Yu, Fumin; Zhang, Hongliang; Dai, Yinglong: Skill Reward for Safe Deep Reinforcement Learning. Ubiquitous Security, 2022, Vol. 1557, pp. 203-213