Task-Agnostic Safety for Reinforcement Learning

Cited by: 1
Authors
Rahman, Md Asifur [1 ]
Alqahtani, Sarra [1 ]
Affiliations
[1] Wake Forest Univ, Winston Salem, NC 27101 USA
Source
PROCEEDINGS OF THE 16TH ACM WORKSHOP ON ARTIFICIAL INTELLIGENCE AND SECURITY, AISEC 2023 | 2023
Funding
U.S. National Science Foundation;
Keywords
Reinforcement Learning; safety; attacks; robustness;
DOI
10.1145/3605764.3623913
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Reinforcement learning (RL) is an attractive paradigm for designing autonomous systems due to its learning-by-exploration approach. However, this learning process makes RL inherently vulnerable and thus unsuitable for applications where safety is a top priority. To address this issue, researchers have either jointly optimized task and safety objectives or imposed constraints to restrict exploration. This paper takes a different approach by utilizing exploration as an adaptive means to learn robust and safe behavior. To this end, we propose the Task-Agnostic Safety for Reinforcement Learning (TAS-RL) framework, which ensures safety in RL by learning a representation of unsafe behaviors and excluding them from the agent's policy. TAS-RL is task-agnostic and can be integrated with any RL task policy in the same environment, providing a self-protection layer for the system. To evaluate the robustness of TAS-RL, we present a novel study in which TAS-RL and 7 safe RL baselines are tested in constrained Markov decision process (CMDP) environments under white-box action space perturbations and changes in the environment dynamics. The results show that TAS-RL outperforms all baselines, achieving consistent near-zero safety constraint violations in continuous action spaces with 10 times more variation in the testing environment dynamics.
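The abstract's core idea, a task-agnostic layer that vetoes unsafe actions from any task policy, can be illustrated with a minimal sketch. This is not the paper's actual method: the `SafetyLayer`, `unsafe_score`, and `safe_policy` names, the threshold test, and the toy 1-D environment are all hypothetical stand-ins for the learned unsafe-behavior representation described above.

```python
import numpy as np

class SafetyLayer:
    """Hypothetical self-protection layer in the spirit of TAS-RL:
    it sits between any task policy and the environment, keeping the
    task's action only when a learned unsafety estimate deems it safe."""

    def __init__(self, unsafe_score, safe_policy, threshold=0.5):
        self.unsafe_score = unsafe_score  # (state, action) -> unsafety in [0, 1]
        self.safe_policy = safe_policy    # state -> fallback recovery action
        self.threshold = threshold

    def act(self, state, task_action):
        # Pass the task action through only when predicted to be safe.
        if self.unsafe_score(state, task_action) < self.threshold:
            return task_action
        # Otherwise substitute the safety policy's recovery action.
        return self.safe_policy(state)

# Toy 1-D example: moving the state outside |s| <= 1 counts as unsafe.
unsafe = lambda s, a: float(abs(s + a) > 1.0)
recover = lambda s: -0.1 * float(np.sign(s))  # nudge back toward the origin

layer = SafetyLayer(unsafe, recover)
print(layer.act(0.0, 0.2))   # safe task action passes through: 0.2
print(layer.act(0.95, 0.2))  # unsafe task action is replaced: -0.1
```

Because the layer only inspects states and actions, not rewards, it is task-agnostic in the sense sketched above: the same `SafetyLayer` instance could wrap any task policy acting in the same environment.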
Pages: 139-148
Page count: 10
Related Papers
50 records
  • [1] Continual deep reinforcement learning with task-agnostic policy distillation
    Hafez, Muhammad Burhan
    Erekmen, Kerim
    SCIENTIFIC REPORTS, 2024, 14 (01):
  • [2] A Task-Agnostic Regularizer for Diverse Subpolicy Discovery in Hierarchical Reinforcement Learning
    Huo, Liangyu
    Wang, Zulin
    Xu, Mai
    Song, Yuhang
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2023, 53 (03): : 1932 - 1944
  • [3] Learning Task-Agnostic Action Spaces for Movement Optimization
    Babadi, Amin
    van de Panne, Michiel
    Liu, C. Karen
    Hamalainen, Perttu
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2022, 28 (12) : 4700 - 4712
  • [4] LEARNING DIVERSE SUB-POLICIES VIA A TASK-AGNOSTIC REGULARIZATION ON ACTION DISTRIBUTIONS
    Huo, Liangyu
    Wang, Zulin
    Xu, Mai
    Song, Yuhang
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 3932 - 3936
  • [5] Adversary Agnostic Robust Deep Reinforcement Learning
    Qu, Xinghua
    Gupta, Abhishek
    Ong, Yew-Soon
    Sun, Zhu
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (09) : 6146 - 6157
  • [6] Variable-Agnostic Causal Exploration for Reinforcement Learning
    Nguyen, Minh Hoang
    Le, Hung
    Venkatesh, Svetha
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: RESEARCH TRACK, PT II, ECML PKDD 2024, 2024, 14942 : 216 - 232
  • [7] Torque-Based Deep Reinforcement Learning for Task-and-Robot Agnostic Learning on Bipedal Robots Using Sim-to-Real Transfer
    Kim, Donghyeon
    Berseth, Glen
    Schwartz, Mathew
    Park, Jaeheung
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2023, 8 (10) : 6251 - 6258
  • [8] Solving Safety Problems with Ensemble Reinforcement Learning
    Ferreira, Leonardo A.
    dos Santos, Thiago F.
    Bianchi, Reinaldo A. C.
    Santos, Paulo E.
    AI 2019: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, 11919 : 203 - 214
  • [9] Offline reinforcement learning with task hierarchies
    Schwab, Devin
    Ray, Soumya
    Machine Learning, 2017, 106 : 1569 - 1598
  • [10] Reinforcement Learning for Disassembly Task Control
    Weerasekara, Sachini
    Li, Wei
    Isaacs, Jacqueline
    Kamarthi, Sagar
    COMPUTERS & INDUSTRIAL ENGINEERING, 2024, 190