Policy Learning with Constraints in Model-free Reinforcement Learning: A Survey

被引:0
|
作者
Liu, Yongshuai [1 ]
Halev, Avishai [1 ,2 ]
Liu, Xin [1 ]
机构
[1] Univ Calif Davis, Davis, CA 95616 USA
[2] Total E&P R&T USA, Houston, TX USA
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Reinforcement Learning (RL) algorithms have had tremendous success in simulated domains. These algorithms, however, often cannot be directly applied to physical systems, especially in cases where there are constraints to satisfy (e.g. to ensure safety or limit resource consumption). In standard RL, the agent is incentivized to explore any policy with the sole goal of maximizing reward; in the real world, however, ensuring satisfaction of certain constraints in the process is also necessary and essential. In this article, we overview existing approaches addressing constraints in model-free reinforcement learning. We model the problem of learning with constraints as a Constrained Markov Decision Process and consider two main types of constraints: cumulative and instantaneous. We summarize existing approaches and discuss their pros and cons. To evaluate policy performance under constraints, we introduce a set of standard benchmarks and metrics. We also summarize limitations of current methods and present open questions for future research.
引用
收藏
页码:4508 / 4515
页数:8
相关论文
共 50 条
  • [1] Model-Free Reinforcement Learning Algorithms: A Survey
    Calisir, Sinan
    Pehlivanoglu, Meltem Kurt
    2019 27TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2019,
  • [2] Model-free reinforcement learning from expert demonstrations: a survey
    Jorge Ramírez
    Wen Yu
    Adolfo Perrusquía
    Artificial Intelligence Review, 2022, 55 : 3213 - 3241
  • [3] Model-free reinforcement learning from expert demonstrations: a survey
    Ramirez, Jorge
    Yu, Wen
    Perrusquia, Adolfo
    ARTIFICIAL INTELLIGENCE REVIEW, 2022, 55 (04) : 3213 - 3241
  • [4] Learning Representations in Model-Free Hierarchical Reinforcement Learning
    Rafati, Jacob
    Noelle, David C.
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 10009 - 10010
  • [5] Model-free Reinforcement Learning of Semantic Communication by Stochastic Policy Gradient
    Beck, Edgar
    Bockelmann, Carsten
    Dekorsy, Armin
    2024 IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING FOR COMMUNICATION AND NETWORKING, ICMLCN 2024, 2024, : 367 - 373
  • [6] Model-free off-policy reinforcement learning in continuous environment
    Wawrzynski, P
    Pacut, A
    2004 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-4, PROCEEDINGS, 2004, : 1091 - 1096
  • [7] Model-Free Trajectory Optimization for Reinforcement Learning
    Akrour, Riad
    Abdolmaleki, Abbas
    Abdulsamad, Hany
    Neumann, Gerhard
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
  • [8] Model-Free Active Exploration in Reinforcement Learning
    Russo, Alessio
    Proutiere, Alexandre
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [9] Model-Free Quantum Control with Reinforcement Learning
    Sivak, V. V.
    Eickbusch, A.
    Liu, H.
    Royer, B.
    Tsioutsios, I
    Devoret, M. H.
    PHYSICAL REVIEW X, 2022, 12 (01)
  • [10] Online Nonstochastic Model-Free Reinforcement Learning
    Ghai, Udaya
    Gupta, Arushi
    Xia, Wenhan
    Singh, Karan
    Hazan, Elad
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,