Metaheuristics-based Exploration Strategies for Multi-Objective Reinforcement Learning

Cited by: 1
Authors
Felten, Florian [1]
Danoy, Gregoire [1, 2]
Talbi, El-Ghazali [3]
Bouvry, Pascal [1, 2]
Affiliations
[1] Univ Luxembourg, SnT, Esch Sur Alzette, Luxembourg
[2] Univ Luxembourg, FSTM DCS, Esch Sur Alzette, Luxembourg
[3] Univ Lille, Inria Lille, CNRS CRIStAL, Lille, France
Keywords
Reinforcement Learning; Multi-objective; Metaheuristics; Pareto Sets;
DOI
10.5220/0010989100003116
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Reinforcement Learning (RL) and Optimization both aim to find an optimal solution to a problem characterized by an objective function. The exploration-exploitation dilemma (EED) is a well-known subject in both fields: a substantial body of literature addresses it and shows that it must be handled carefully to achieve good performance. Yet many real-life problems involve optimizing multiple objectives. Multi-Policy Multi-Objective Reinforcement Learning (MPMORL) offers a way to learn various optimised behaviours for the agent in such problems. This work introduces a modular framework for the learning phase of such algorithms, which eases the study of the EED in Inner-Loop MPMORL algorithms. We present three new exploration strategies inspired by the metaheuristics domain. To assess the performance of our methods on various environments, we use a classical benchmark, the Deep Sea Treasure (DST), and propose a harder version of it. Our experiments show that all of the proposed strategies outperform the current state-of-the-art ε-greedy based methods on the studied benchmarks.
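For orientation only, the ε-greedy baseline and the Pareto sets named in the abstract can be illustrated with a minimal sketch. This is not the authors' framework or any of their three strategies. The function names `dominates`, `pareto_front` and `epsilon_greedy` are hypothetical, every objective is assumed to be maximised, and each action is assumed to carry a single value vector, which is a simplification chosen for brevity.

```python
import random


def dominates(a, b):
    """True if vector a Pareto-dominates b (every objective is maximised)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Return the non-dominated vectors in input order, without duplicates."""
    front = []
    for p in points:
        if p in front:
            continue
        if not any(dominates(q, p) for q in points):
            front.append(p)
    return front


def epsilon_greedy(action_values, epsilon, rng=random):
    """Explore uniformly with probability epsilon; otherwise pick uniformly
    among the actions whose value vector is non-dominated.

    action_values: dict mapping an action to one value vector (an assumption
    made for this sketch only).
    """
    actions = list(action_values)
    if rng.random() < epsilon:
        return rng.choice(actions)
    front = pareto_front([action_values[a] for a in actions])
    greedy = [a for a in actions if action_values[a] in front]
    return rng.choice(greedy)
```

With illustrative (treasure, negative time) vectors such as `{"left": (1.0, -1.0), "right": (8.0, -5.0), "down": (0.5, -3.0)}`, the vector for `"down"` is dominated by the one for `"left"`, so the greedy branch only ever returns `"left"` or `"right"`.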
Pages: 662-673
Page count: 12