A Review of Safe Reinforcement Learning: Methods, Theories, and Applications

Cited by: 4
Authors
Gu, Shangding [1]
Yang, Long [3]
Du, Yali [4]
Chen, Guang [5]
Walter, Florian [2]
Wang, Jun [6]
Knoll, Alois [2]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
[2] Tech Univ Munich, Dept Informat, D-85748 Munich, Germany
[3] Peking Univ, Inst AI, Beijing 100871, Peoples R China
[4] Kings Coll London, Dept Informat, London WC1E 6EB, England
[5] Tongji Univ, Dept Comp Sci & Technol, Shanghai 201804, Peoples R China
[6] UCL, Dept Comp Sci, London WC1E 6BT, England
Funding
National Natural Science Foundation of China;
Keywords
Safe reinforcement learning (RL); safety optimisation; constrained Markov decision processes; safety problems; MARKOV DECISION-PROCESSES; ACTOR-CRITIC ALGORITHM; APPROXIMATION; MODEL; NETWORKS; POLICIES; CHAINS;
DOI
10.1109/TPAMI.2024.3457538
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Reinforcement Learning (RL) has achieved tremendous success in many complex decision-making tasks. However, safety concerns arise when deploying RL in real-world applications such as autonomous driving and robotics, leading to a growing demand for safe RL algorithms. While safe control has a long history, the study of safe RL algorithms is still in its early stages. To establish a good foundation for future safe RL research, in this paper we review safe RL from the perspectives of methods, theories, and applications. First, we review the progress of safe RL along five dimensions and identify five crucial problems for deploying safe RL in real-world applications, which we term "2H3W". Second, we analyze algorithmic and theoretical progress from the perspective of answering the "2H3W" problems. In particular, the sample complexity of safe RL algorithms is reviewed and discussed, followed by an introduction to the applications and benchmarks of safe RL algorithms. Finally, we open a discussion of the challenging problems in safe RL, hoping to inspire future research in this direction. To advance the study of safe RL algorithms, we release an open-source repository containing major safe RL algorithms at the link.
Pages: 11216 - 11235
Page count: 20
Related papers
50 records in total
  • [21] Applications of Reinforcement Learning for maintenance of engineering systems: A review
    Marugan, Alberto Pliego
    ADVANCES IN ENGINEERING SOFTWARE, 2023, 183
  • [22] A bibliometric analysis and review on reinforcement learning for transportation applications
    Li, Can
    Bai, Lei
    Yao, Lina
    Waller, S. Travis
    Liu, Wei
    TRANSPORTMETRICA B-TRANSPORT DYNAMICS, 2023, 11 (01)
  • [23] Applications of deep reinforcement learning in nuclear energy: A review
    Liu, Yongchao
    Wang, Bo
    Tan, Sichao
    Li, Tong
    Lv, Wei
    Niu, Zhenfeng
    Li, Jiangkuan
    Gao, Puzhen
    Tian, Ruifeng
    NUCLEAR ENGINEERING AND DESIGN, 2024, 429
  • [24] Safe Reinforcement Learning and Adaptive Optimal Control With Applications to Obstacle Avoidance Problem
    Wang, Ke
    Mu, Chaoxu
    Ni, Zhen
    Liu, Derong
    IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2024, 21 (03) : 4599 - 4612
  • [25] The Applicability of Reinforcement Learning Methods in the Development of Industry 4.0 Applications
    Kegyes, Tamas
    Sule, Zoltan
    Abonyi, Janos
    COMPLEXITY, 2021, 2021
  • [26] Surface modification of calcium carbonate: A review of theories, methods and applications
    Li, Chun-quan
    Liang, Chao
    Chen, Zhen-ming
    Di, Yong-hao
    Zheng, Shui-lin
    Wei, Shi
    Sun, Zhi-ming
    JOURNAL OF CENTRAL SOUTH UNIVERSITY, 2021, 28 (09) : 2589 - 2611
  • [27] What Is Acceptably Safe for Reinforcement Learning?
    Bragg, John
    Habli, Ibrahim
    COMPUTER SAFETY, RELIABILITY, AND SECURITY, SAFECOMP 2018, 2018, 11094 : 418 - 430
  • [28] A comprehensive survey on safe reinforcement learning
    García, Javier
    Fernández, Fernando
    JOURNAL OF MACHINE LEARNING RESEARCH, 2015, 16 : 1437 - 1480
  • [29] Safe Reinforcement Learning for Sepsis Treatment
    Jia, Yan
    Burden, John
    Lawton, Tom
    Habli, Ibrahim
    2020 8TH IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS (ICHI 2020), 2020 : 108 - 114
  • [30] Lyapunov design for safe reinforcement learning
    Perkins, TJ
    Barto, AG
    JOURNAL OF MACHINE LEARNING RESEARCH, 2003, 3 (4-5) : 803 - 832