Actor-critic multi-objective reinforcement learning for non-linear utility functions

Cited by: 5

Authors:

Reymond, Mathieu [1]

Hayes, Conor F. [2]

Steckelmacher, Denis [1]

Roijers, Diederik M. [1,3]

Nowe, Ann [1]

Affiliations:

[1] Vrije Univ Brussel, Brussels, Belgium

[2] Univ Galway, Galway, Ireland

[3] HU Univ Appl Sci Utrecht, Utrecht, Netherlands

Source:

AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS | 2023 / Vol. 37 / Issue 02

Keywords:

Reinforcement learning; Multi-objective reinforcement learning; Non-linear utility functions; Expected scalarized return; SETS;

DOI:

10.1007/s10458-023-09604-x

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

We propose a novel multi-objective reinforcement learning algorithm that learns the optimal policy even for non-linear utility functions. Non-linear utility functions pose a challenge for state-of-the-art approaches, both in terms of learning efficiency and the solution concept. The key insight is that a critic which learns a multivariate distribution over the returns, combined with the accumulated rewards, lets us optimize the utility function directly, even if it is non-linear. This vastly increases the range of problems that can be solved compared with single-objective methods or multi-objective methods that require linear utility functions, while avoiding the need to learn the full Pareto front. We demonstrate our method on multiple multi-objective benchmarks and show that it learns effectively where baseline approaches fail.
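The abstract does not spell out how the return distribution and the accumulated rewards are combined, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes the critic's output can be summarized as a categorical distribution over future return vectors (`atoms` with probabilities `probs`), and it uses a hypothetical product utility. The names `expected_utility` and `product_utility` are illustrative. The sketch shows why a distribution matters for a non-linear utility: the expectation of the utility (the expected scalarized return listed in the keywords) generally differs from the utility of the expected return.

```python
import numpy as np

def expected_utility(accrued, atoms, probs, utility):
    """Estimate E[u(accrued + Z)] for a categorical distribution over return vectors Z.

    accrued: rewards accumulated so far, shape (n_objectives,)
    atoms:   possible future return vectors, shape (n_atoms, n_objectives)
    probs:   probability of each atom, shape (n_atoms,)
    """
    atoms = np.asarray(atoms, dtype=float)
    probs = np.asarray(probs, dtype=float)
    totals = np.asarray(accrued, dtype=float) + atoms  # full-episode return per atom
    utilities = np.array([utility(v) for v in totals])
    return float(probs @ utilities)

def product_utility(v):
    # Hypothetical non-linear utility: both objectives must be satisfied together.
    return float(v[0] * v[1])

accrued = [1.0, 0.0]              # assumed rewards collected so far
atoms = [[0.0, 2.0], [2.0, 0.0]]  # assumed support of the future return distribution
probs = [0.5, 0.5]

# Expectation of the utility over the distribution.
esr = expected_utility(accrued, atoms, probs, product_utility)

# Utility of the expected return, which is all a mean-only critic could provide.
mean_future = np.asarray(probs) @ np.asarray(atoms)
utility_of_mean = product_utility(np.asarray(accrued) + mean_future)

print(esr, utility_of_mean)
```

In this toy setting the two quantities disagree, which is the situation in which a critic that only estimates the mean return would mislead the policy.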


Pages: 30

