Task-Agnostic Safety for Reinforcement Learning

Cited by: 1
Authors
Rahman, Md Asifur [1 ]
Alqahtani, Sarra [1 ]
Affiliations
[1] Wake Forest Univ, Winston Salem, NC 27101 USA
Source
PROCEEDINGS OF THE 16TH ACM WORKSHOP ON ARTIFICIAL INTELLIGENCE AND SECURITY, AISEC 2023 | 2023
Funding
U.S. National Science Foundation;
Keywords
Reinforcement Learning; safety; attacks; robustness;
DOI
10.1145/3605764.3623913
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Reinforcement learning (RL) is an attractive paradigm for designing autonomous systems due to its learning-by-exploration approach. However, this learning process makes RL inherently vulnerable and thus unsuitable for applications where safety is a top priority. To address this issue, researchers have either jointly optimized task and safety objectives or imposed constraints to restrict exploration. This paper takes a different approach by utilizing exploration as an adaptive means to learn robust and safe behavior. To this end, we propose the Task-Agnostic Safety for Reinforcement Learning (TAS-RL) framework, which ensures safety in RL by learning a representation of unsafe behaviors and excluding them from the agent's policy. TAS-RL is task-agnostic and can be integrated with any RL task policy in the same environment, providing a self-protection layer for the system. To evaluate the robustness of TAS-RL, we present a novel study in which TAS-RL and 7 safe RL baselines are tested in constrained Markov decision process (CMDP) environments under white-box action space perturbations and changes in the environment dynamics. The results show that TAS-RL outperforms all baselines, achieving consistent near-zero safety constraint violations in continuous action spaces with 10 times more variation in the testing environment dynamics.
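The abstract's core idea, a task-agnostic layer that vetoes unsafe actions from any task policy, can be illustrated with a minimal sketch. This is not the paper's actual method: the `SafetyLayer`, `unsafe_score`, and `safe_policy` names, the threshold test, and the toy 1-D environment are all hypothetical stand-ins for the learned unsafe-behavior representation described above.

```python
import numpy as np

class SafetyLayer:
    """Hypothetical self-protection layer in the spirit of TAS-RL:
    it sits between any task policy and the environment, keeping the
    task's action only when a learned unsafety estimate deems it safe."""

    def __init__(self, unsafe_score, safe_policy, threshold=0.5):
        self.unsafe_score = unsafe_score  # (state, action) -> unsafety in [0, 1]
        self.safe_policy = safe_policy    # state -> fallback recovery action
        self.threshold = threshold

    def act(self, state, task_action):
        # Pass the task action through only when predicted to be safe.
        if self.unsafe_score(state, task_action) < self.threshold:
            return task_action
        # Otherwise substitute the safety policy's recovery action.
        return self.safe_policy(state)

# Toy 1-D example: moving the state outside |s| <= 1 counts as unsafe.
unsafe = lambda s, a: float(abs(s + a) > 1.0)
recover = lambda s: -0.1 * float(np.sign(s))  # nudge back toward the origin

layer = SafetyLayer(unsafe, recover)
print(layer.act(0.0, 0.2))   # safe task action passes through: 0.2
print(layer.act(0.95, 0.2))  # unsafe task action is replaced: -0.1
```

Because the layer only inspects states and actions, not rewards, it is task-agnostic in the sense sketched above: the same `SafetyLayer` instance could wrap any task policy acting in the same environment.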
Pages: 139-148
Page count: 10
Related Papers
50 records
  • [1] Continual deep reinforcement learning with task-agnostic policy distillation
    Hafez, Muhammad Burhan
    Erekmen, Kerim
    SCIENTIFIC REPORTS, 2024, 14 (01):
  • [2] A Task-Agnostic Regularizer for Diverse Subpolicy Discovery in Hierarchical Reinforcement Learning
    Huo, Liangyu
    Wang, Zulin
    Xu, Mai
    Song, Yuhang
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2023, 53 (03): : 1932 - 1944
  • [3] Learning Task-Agnostic Action Spaces for Movement Optimization
    Babadi, Amin
    van de Panne, Michiel
    Liu, C. Karen
    Hamalainen, Perttu
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2022, 28 (12) : 4700 - 4712
  • [4] LEARNING DIVERSE SUB-POLICIES VIA A TASK-AGNOSTIC REGULARIZATION ON ACTION DISTRIBUTIONS
    Huo, Liangyu
    Wang, Zulin
    Xu, Mai
    Song, Yuhang
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 3932 - 3936
  • [5] Adversary Agnostic Robust Deep Reinforcement Learning
    Qu, Xinghua
    Gupta, Abhishek
    Ong, Yew-Soon
    Sun, Zhu
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (09) : 6146 - 6157
  • [6] Variable-Agnostic Causal Exploration for Reinforcement Learning
    Nguyen, Minh Hoang
    Le, Hung
    Venkatesh, Svetha
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: RESEARCH TRACK, PT II, ECML PKDD 2024, 2024, 14942 : 216 - 232
  • [7] Torque-Based Deep Reinforcement Learning for Task-and-Robot Agnostic Learning on Bipedal Robots Using Sim-to-Real Transfer
    Kim, Donghyeon
    Berseth, Glen
    Schwartz, Mathew
    Park, Jaeheung
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2023, 8 (10) : 6251 - 6258
  • [8] Solving Safety Problems with Ensemble Reinforcement Learning
    Ferreira, Leonardo A.
    dos Santos, Thiago F.
    Bianchi, Reinaldo A. C.
    Santos, Paulo E.
    AI 2019: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, 11919 : 203 - 214
  • [9] Offline reinforcement learning with task hierarchies
    Schwab, Devin
    Ray, Soumya
    Machine Learning, 2017, 106 : 1569 - 1598
  • [10] Reinforcement Learning for Disassembly Task Control
    Weerasekara, Sachini
    Li, Wei
    Isaacs, Jacqueline
    Kamarthi, Sagar
    COMPUTERS & INDUSTRIAL ENGINEERING, 2024, 190