Metaheuristics-based Exploration Strategies for Multi-Objective Reinforcement Learning

Cited by: 1

Authors:

Felten, Florian ^{[1]}

Danoy, Gregoire ^{[1,2]}

Talbi, El-Ghazali ^{[3]}

Bouvry, Pascal ^{[1,2]}

Affiliations:

[1] Univ Luxembourg, SnT, Esch Sur Alzette, Luxembourg

[2] Univ Luxembourg, FSTM DCS, Esch Sur Alzette, Luxembourg

[3] Univ Lille, Inria Lille, CNRS CRIStAL, Lille, France

Source:

ICAART: PROCEEDINGS OF THE 14TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE - VOL 2 | 2022

Keywords:

Reinforcement Learning; Multi-objective; Metaheuristics; Pareto Sets;

DOI:

10.5220/0010989100003116

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The fields of Reinforcement Learning (RL) and Optimization aim at finding an optimal solution to a problem, characterized by an objective function. The exploration-exploitation dilemma (EED) is a well-known subject in those fields. Indeed, a substantial body of literature addresses it and shows that it must be handled carefully to achieve good performance. Yet, many real-life problems involve the optimization of multiple objectives. Multi-Policy Multi-Objective Reinforcement Learning (MPMORL) offers a way to learn various optimised behaviours for the agent in such problems. This work introduces a modular framework for the learning phase of such algorithms, easing the study of the EED in Inner-Loop MPMORL algorithms. We present three new exploration strategies inspired by the metaheuristics domain. To assess the performance of our methods on various environments, we use a classical benchmark, the Deep Sea Treasure (DST), and propose a harder version of it. Our experiments show that all of the proposed strategies outperform the current state-of-the-art ε-greedy based methods on the studied benchmarks.

Pages: 662-673

Page count: 12
