Efficient Elitist Cooperative Evolutionary Algorithm for Multi-Objective Reinforcement Learning

Cited by: 5
Authors
Zhou, Dan [1 ]
Du, Jiqing [1 ]
Arai, Sachiyo [1 ]
Affiliations
[1] Chiba Univ, Grad Sch Sci & Engn, Dept Urban Environm Syst, Div Earth & Environm Sci, Chiba 2638522, Japan
Funding
Japan Society for the Promotion of Science (JSPS)
Keywords
Pareto optimization; Statistics; Social factors; Underwater vehicles; Measurement; Q-learning; Uncertainty; Reinforcement learning; Multi-objective reinforcement learning; efficient; cooperative; Pareto front; elite archive; GENETIC ALGORITHM;
DOI
10.1109/ACCESS.2023.3272115
Chinese Library Classification (CLC)
TP [automation technology, computer technology]
Discipline Code
0812
Abstract
Sequential decision-making problems with multiple objectives are studied as multi-objective reinforcement learning. In these scenarios, decision-makers require a complete Pareto front consisting of Pareto-optimal solutions; such a front reveals the trade-offs between objectives and supports informed decisions across a broad range of solutions. However, existing methods may fail to find solutions in concave regions of the Pareto front or lack global optimization ability, yielding incomplete Pareto fronts. To address this issue, we propose an efficient elitist cooperative evolutionary algorithm that maintains both an evolving population and an elite archive. The elite archive guides the evolving population through cooperative operations with various genetic operators, enabling an efficient search for Pareto-optimal solutions. Experimental results on submarine treasure-hunting benchmarks demonstrate that the proposed method solves a variety of multi-objective reinforcement learning problems and provides decision-makers with a set of solutions that trade off travel time against treasure amount, allowing flexible decisions based on their preferences. The proposed method therefore has the potential to be a useful tool for real-world applications.
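The abstract's core mechanism, an evolving population guided by a Pareto-based elite archive through cooperative crossover, can be illustrated with a minimal sketch. This is not the paper's algorithm: the dominance test, uniform crossover, Gaussian mutation, and real-valued genome below are all simplifying assumptions standing in for the paper's genetic operators and policy encoding.

```python
import random

def dominates(a, b):
    """True if fitness vector a Pareto-dominates b (maximization)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def update_archive(archive, candidate, fitness):
    """Insert candidate, then keep only mutually non-dominated solutions."""
    sols = archive + [(candidate, fitness)]
    return [s for s in sols
            if not any(dominates(o[1], s[1]) for o in sols if o is not s)]

def evolve(evaluate, genome_len, pop_size=20, generations=50, seed=0):
    """Toy elitist cooperative EA: the elite archive supplies one parent
    for every crossover, steering the population toward the Pareto front."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(genome_len)] for _ in range(pop_size)]
    archive = []
    for ind in pop:
        archive = update_archive(archive, ind, evaluate(ind))
    for _ in range(generations):
        new_pop = []
        for _ in range(pop_size):
            # cooperative crossover: one parent drawn from the elite
            # archive, the other from the evolving population
            p1 = rng.choice(archive)[0]
            p2 = rng.choice(pop)
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            # Gaussian mutation on one gene, clipped to [0, 1]
            i = rng.randrange(genome_len)
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0.0, 0.1)))
            archive = update_archive(archive, child, evaluate(child))
            new_pop.append(child)
        pop = new_pop
    return archive
```

The archive doubles as the algorithm's output: because `update_archive` discards any dominated entry, the returned set is an approximation of the Pareto front, analogous to the time/treasure trade-off set the paper reports for the treasure-hunting benchmarks.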
Pages: 43128-43139 (12 pages)