Probabilistic Guarantees for Safe Deep Reinforcement Learning

Cited by: 15
Authors: Bacci, Edoardo [1]; Parker, David [1]
Affiliation: [1] University of Birmingham, Birmingham, England
Source: FORMAL MODELING AND ANALYSIS OF TIMED SYSTEMS, FORMATS 2020 | 2020, Vol. 12288
Funding: European Research Council
DOI: 10.1007/978-3-030-57628-8_14
CLC Classification: TP31 [Computer software]
Subject Classification: 081202; 0835
Abstract:
Deep reinforcement learning has been successfully applied to many control tasks, but the application of such controllers in safety-critical scenarios has been limited due to safety concerns. Rigorous testing of these controllers is challenging, particularly when they operate in probabilistic environments due to, for example, hardware faults or noisy sensors. We propose MOSAIC, an algorithm for measuring the safety of deep reinforcement learning controllers in stochastic settings. Our approach is based on the iterative construction of a formal abstraction of a controller's execution in an environment, and leverages probabilistic model checking of Markov decision processes to produce probabilistic guarantees on safe behaviour over a finite time horizon. It produces bounds on the probability of safe operation of the controller for different initial configurations and identifies regions where correct behaviour can be guaranteed. We implement and evaluate our approach on controllers trained for several benchmark control problems.
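The abstract's central computation, bounding the probability of safe behaviour over a finite time horizon by analysing a Markov decision process, can be illustrated with backward induction. The following is a minimal sketch, not the paper's MOSAIC implementation: in MOSAIC the MDP arises from an iteratively constructed abstraction of the controller's execution and is analysed with a probabilistic model checker, whereas here the toy MDP, the state names, and the `safety_bounds` helper are all invented for illustration.

```python
# Illustrative sketch (not the paper's MOSAIC algorithm): backward
# induction on a small MDP to bound the probability of remaining in
# safe states over a finite horizon. Model and names are invented.

def safety_bounds(states, actions, trans, safe, horizon):
    """Return (lo, hi): for each state, the minimum and maximum
    probability, over resolutions of the MDP's nondeterminism, of
    staying inside `safe` for `horizon` steps.
    trans[s][a] is a list of (probability, successor) pairs."""
    lo = {s: (1.0 if s in safe else 0.0) for s in states}
    hi = dict(lo)
    for _ in range(horizon):
        new_lo, new_hi = {}, {}
        for s in states:
            if s not in safe:  # already unsafe: safety probability is 0
                new_lo[s] = new_hi[s] = 0.0
                continue
            step_lo = [sum(p * lo[t] for p, t in trans[s][a]) for a in actions[s]]
            step_hi = [sum(p * hi[t] for p, t in trans[s][a]) for a in actions[s]]
            new_lo[s], new_hi[s] = min(step_lo), max(step_hi)
        lo, hi = new_lo, new_hi
    return lo, hi

# Toy 2-state example: from s0, the only action stays safe with
# probability 0.9 and fails with probability 0.1 at each step.
states = ["s0", "fail"]
actions = {"s0": ["a"], "fail": ["a"]}
trans = {"s0": {"a": [(0.9, "s0"), (0.1, "fail")]},
         "fail": {"a": [(1.0, "fail")]}}
lo, hi = safety_bounds(states, actions, trans, {"s0"}, horizon=2)
print(lo["s0"], hi["s0"])  # both approximately 0.81 (= 0.9 ** 2)
```

Because the toy MDP has a single action per state, the lower and upper bounds coincide; in an abstraction of a learned controller, the nondeterminism introduced by abstract states makes them differ, and the gap is what the iterative refinement described in the abstract tightens.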
Pages: 231-248 (18 pages)
Related papers (50 in total):
  • [1] Krasowski, Hanna; Akella, Prithvi; Ames, Aaron D.; Althoff, Matthias: Safe Reinforcement Learning with Probabilistic Guarantees Satisfying Temporal Logic Specifications in Continuous Action Spaces. 62nd IEEE Conference on Decision and Control (CDC), 2023, pp. 4372-4378
  • [2] Garcia, Javier; Fernandez, Fernando: Probabilistic Policy Reuse for Safe Reinforcement Learning. ACM Transactions on Autonomous and Adaptive Systems, 2019, 13(3)
  • [3] Berkenkamp, Felix; Turchetta, Matteo; Schoellig, Angela P.; Krause, Andreas: Safe Model-based Reinforcement Learning with Stability Guarantees. Advances in Neural Information Processing Systems 30 (NIPS 2017), 2017
  • [4] Marzari, Luca; Corsi, Davide; Marchesini, Enrico; Farinelli, Alessandro; Cicalese, Ferdinando: Enumerating Safe Regions in Deep Neural Networks with Provable Probabilistic Guarantees. Thirty-Eighth AAAI Conference on Artificial Intelligence, Vol. 38, No. 19, 2024, pp. 21387-21394
  • [5] Li, Ji; Chen, Zewei; Liu, Xin: Deep Reinforcement Learning for Partial Offloading with Reliability Guarantees. 19th IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA/BDCloud/SocialCom/SustainCom 2021), 2021, pp. 1027-1034
  • [6] Hasanbeig, M.; Kantaros, Y.; Abate, A.; Kroening, D.; Pappas, G. J.; Lee, I.: Reinforcement Learning for Temporal Logic Control Synthesis with Probabilistic Satisfaction Guarantees. 58th IEEE Conference on Decision and Control (CDC), 2019, pp. 5338-5343
  • [7] Yang, Wen-Chi; Marra, Giuseppe; Rens, Gavin; De Raedt, Luc: Safe Reinforcement Learning via Probabilistic Logic Shields. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence (IJCAI 2023), 2023, pp. 5739-5749
  • [8] Yang, Wen-Chi; Marra, Giuseppe; Rens, Gavin; De Raedt, Luc: Safe Reinforcement Learning via Probabilistic Logic Shields. Neural-Symbolic Learning and Reasoning 2023 (NeSy 2023), 2023
  • [9] Bacci, Edoardo; Parker, David: Verified Probabilistic Policies for Deep Reinforcement Learning. NASA Formal Methods (NFM 2022), 2022, Vol. 13260, pp. 193-212
  • [10] Cheng, Jiangchang; Yu, Fumin; Zhang, Hongliang; Dai, Yinglong: Skill Reward for Safe Deep Reinforcement Learning. Ubiquitous Security, 2022, Vol. 1557, pp. 203-213