Metaheuristics-based Exploration Strategies for Multi-Objective Reinforcement Learning

Cited by: 1
Authors
Felten, Florian [1 ]
Danoy, Gregoire [1 ,2 ]
Talbi, El-Ghazali [3 ]
Bouvry, Pascal [1 ,2 ]
Affiliations
[1] Univ Luxembourg, SnT, Esch Sur Alzette, Luxembourg
[2] Univ Luxembourg, FSTM DCS, Esch Sur Alzette, Luxembourg
[3] Univ Lille, Inria Lille, CNRS CRIStAL, Lille, France
Keywords
Reinforcement Learning; Multi-objective; Metaheuristics; Pareto Sets;
DOI: 10.5220/0010989100003116
CLC classification number: TP18 [Artificial Intelligence Theory]
Discipline classification codes: 081104; 0812; 0835; 1405
Abstract
The fields of Reinforcement Learning (RL) and Optimization aim at finding an optimal solution to a problem, characterized by an objective function. The exploration-exploitation dilemma (EED) is a well-known subject in these fields: a considerable body of literature has already addressed it and shown it to be a non-negligible factor in achieving good performance. Yet many real-life problems involve the optimization of multiple objectives. Multi-Policy Multi-Objective Reinforcement Learning (MPMORL) offers a way to learn various optimised behaviours for the agent in such problems. This work introduces a modular framework for the learning phase of such algorithms, easing the study of the EED in Inner-Loop MPMORL algorithms. We present three new exploration strategies inspired by the metaheuristics domain. To assess the performance of our methods on various environments, we use a classical benchmark, the Deep Sea Treasure (DST), and also propose a harder version of it. Our experiments show that all of the proposed strategies outperform the current state-of-the-art ε-greedy based methods on the studied benchmarks.
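The ε-greedy baseline the abstract refers to can be illustrated with a minimal sketch. This is not the authors' algorithm, only a generic example of ε-greedy action selection over vector-valued Q estimates under a linear scalarization; the function name, the two-objective reward layout (treasure value, time penalty) and the weight vector are all hypothetical choices made for illustration.

```python
import random

def epsilon_greedy(q_values, weights, epsilon, rng=random):
    """With probability epsilon pick a uniformly random action (explore);
    otherwise pick the action whose vector-valued Q estimate maximizes
    the linearly scalarized value sum(w_i * q_i) (exploit)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    scalarized = [sum(w * q for w, q in zip(weights, qv)) for qv in q_values]
    return max(range(len(scalarized)), key=scalarized.__getitem__)

# q_values[a] holds a (treasure value, time penalty) estimate per action.
q_values = [(0.5, -1.0), (2.0, -3.0), (1.0, -0.5)]
weights = (1.0, 0.1)  # hypothetical preference weighting the two objectives
action = epsilon_greedy(q_values, weights, epsilon=0.0)  # greedy choice
```

With epsilon = 0 the selection is purely greedy with respect to the scalarized values; a multi-policy (outer-loop) method would vary the weight vector to recover different Pareto-optimal behaviours, while the inner-loop setting studied in the paper learns a set of policies within a single run.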
Pages: 662-673 (12 pages)