Multi-objective ω-Regular Reinforcement Learning

Cited by: 3
Authors
Hahn, Ernst Moritz [1]
Perez, Mateo [2]
Schewe, Sven [3, 4]
Somenzi, Fabio
Trivedi, Ashutosh [2]
Wojtczak, Dominik [3]
Affiliations
[1] Univ Twente, Fac Elect Engn Math & Comp Sci, Enschede, Netherlands
[2] Univ Colorado Boulder, Dept Comp Sci, Boulder, CO USA
[3] Univ Liverpool, Dept Comp Sci, Liverpool, Merseyside, England
[4] Univ Colorado Boulder, Dept Elect Comp & Energy Engn, Boulder, CO USA
Funding
European Union Horizon 2020; UK Engineering and Physical Sciences Research Council (EPSRC); US National Science Foundation (NSF)
Keywords
Multi-objective reinforcement learning; omega-regular objectives; lexicographic preference; weighted preference; automata-theoretic reinforcement learning; MARKOV DECISION-PROCESSES; STOCHASTIC GAMES; MODEL CHECKING; DOPAMINE; LEVEL;
DOI
10.1145/3605950
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
The expanding role of reinforcement learning (RL) in safety-critical system design has promoted omega-automata as a way to express learning requirements (often non-Markovian) with greater ease of expression and interpretation than scalar reward signals. However, real-world sequential decision-making situations often involve multiple, potentially conflicting, objectives. Two dominant approaches to express relative preferences over multiple objectives are: (1) weighted preference, where the decision maker provides scalar weights for various objectives, and (2) lexicographic preference, where the decision maker provides an order over the objectives such that any amount of satisfaction of a higher-ordered objective is preferable to any amount of a lower-ordered one. In this article, we study and develop RL algorithms to compute optimal strategies in Markov decision processes against multiple omega-regular objectives under weighted and lexicographic preferences. We provide a translation from multiple omega-regular objectives to a scalar reward signal that is both faithful (maximising reward means maximising probability of achieving the objectives under the corresponding preference) and effective (RL quickly converges to optimal strategies). We have implemented the translations in a formal reinforcement learning tool, MUNGOJERRIE, and we present an experimental evaluation of our technique on benchmark learning problems.
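The two preference schemes named in the abstract can be contrasted with a small sketch. This is only an illustration of how weighted and lexicographic preferences rank vectors of satisfaction probabilities; it is not the reward translation from the article, and the strategy names, probabilities, weights, and function names are all hypothetical.

```python
from typing import Sequence, Tuple


def weighted_value(probs: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted preference: scalarise by a weighted sum of satisfaction probabilities."""
    if len(probs) != len(weights):
        raise ValueError("probs and weights must have the same length")
    return sum(w * p for w, p in zip(weights, probs))


def lexicographic_key(probs: Sequence[float]) -> Tuple[float, ...]:
    """Lexicographic preference: compare objective by objective, highest priority first.

    Python compares tuples element by element, so the tuple itself is the key.
    """
    return tuple(probs)


# Hypothetical satisfaction probabilities (objective 1, objective 2) per strategy.
strategies = {
    "a": (0.9, 0.1),
    "b": (0.8, 0.9),
    "c": (0.9, 0.3),
}
weights = (0.5, 0.5)  # assumed equal weights, for illustration only

best_weighted = max(strategies, key=lambda s: weighted_value(strategies[s], weights))
best_lexicographic = max(strategies, key=lambda s: lexicographic_key(strategies[s]))

print(best_weighted, best_lexicographic)
```

With these made-up numbers the weighted scheme favours strategy "b", which trades a little of the first objective for much more of the second, while the lexicographic scheme favours "c", which first maximises the higher-priority objective and only then breaks the tie on the second.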
Pages: 24