Optimizing parameters in swarm intelligence using reinforcement learning: An application of Proximal Policy Optimization to the iSOMA algorithm

Cited by: 10
Authors
Klein, Lukas [1 ,2 ]
Zelinka, Ivan [1 ]
Seidl, David [1 ,2 ]
Affiliations
[1] VSB Tech Univ Ostrava, Dept Comp Sci, FEI, Ostrava 70800, Czech Republic
[2] VSB Tech Univ Ostrava, ENET Ctr, Ostrava 70800, Czech Republic
Keywords
Self-Organizing Migrating Algorithm; Optimization algorithm; Swarm intelligence; Numerical optimization; Reinforcement learning
DOI
10.1016/j.swevo.2024.101487
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
This paper presents a new algorithm for optimizing parameters in swarm algorithms using reinforcement learning. The algorithm, called iSOMA-RL, is based on the iSOMA algorithm, a population-based optimization algorithm that mimics the competition-cooperation behavior of creatures to find the optimal solution. By using reinforcement learning, iSOMA-RL can dynamically and continuously optimize parameters that play a crucial role in determining the performance of the algorithm but are often difficult to set. The reinforcement learning technique used is the state-of-the-art Proximal Policy Optimization (PPO), which has been successful in many areas. The algorithm was compared to the original iSOMA algorithm and other algorithms from the SOMA family, showing better performance with only a constant increase in computational complexity with respect to the number of function evaluations. We also examine different sets of parameters to optimize and different reward functions, and compare against widely used and state-of-the-art algorithms to illustrate the improvement in performance over the original iSOMA algorithm.
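The abstract describes a control loop in which a PPO agent repeatedly observes the state of the swarm, sets iSOMA control parameters for the next migration loop, and receives a reward tied to fitness improvement. The paper's exact state, action, and reward definitions are not reproduced in this record, so the sketch below is a minimal, hypothetical illustration of that loop using the gymnasium and stable-baselines3 libraries: the observation features, the controlled parameters (a PRT-style perturbation probability and a step length), the sphere objective, and the improvement-based reward are all assumptions, not the authors' implementation.

# Minimal sketch of PPO-driven parameter control for a SOMA-style optimizer.
# ASSUMPTIONS: the observation (log best fitness, loop progress), the action
# (PRT and Step values for the next migration loop), and the reward
# (relative improvement of the best fitness) are illustrative only.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO


def sphere(x):
    """Toy objective; the paper benchmarks on standard test suites instead."""
    return float(np.sum(x ** 2))


class SomaParamEnv(gym.Env):
    """One episode = one optimization run; one env step = one migration loop."""

    def __init__(self, dim=10, pop_size=20, max_loops=50):
        self.dim, self.pop_size, self.max_loops = dim, pop_size, max_loops
        # Agent observes log-scaled best fitness and normalized loop progress.
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(2,), dtype=np.float32)
        # Agent outputs two values in (0, 1), rescaled to PRT and Step below.
        self.action_space = spaces.Box(0.0, 1.0, shape=(2,), dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.pop = self.np_random.uniform(-5, 5, size=(self.pop_size, self.dim))
        self.fit = np.array([sphere(p) for p in self.pop])
        self.loop = 0
        return self._obs(), {}

    def _obs(self):
        return np.array([np.log1p(self.fit.min()), self.loop / self.max_loops],
                        dtype=np.float32)

    def step(self, action):
        prt = 0.05 + 0.9 * float(action[0])   # perturbation probability (assumed range)
        step = 0.1 + 0.9 * float(action[1])   # jump length toward leader (assumed range)
        best_before = self.fit.min()
        leader = self.pop[self.fit.argmin()]
        # Simplified all-to-one migration: each individual jumps toward the
        # leader along a PRT-masked direction, keeping the jump if it improves.
        for i in range(self.pop_size):
            mask = (self.np_random.random(self.dim) < prt).astype(float)
            trial = self.pop[i] + (leader - self.pop[i]) * step * mask
            f = sphere(trial)
            if f < self.fit[i]:
                self.pop[i], self.fit[i] = trial, f
        self.loop += 1
        # Reward: relative improvement of the best-so-far fitness (one of many
        # possible reward designs; the paper compares several).
        reward = (best_before - self.fit.min()) / (abs(best_before) + 1e-12)
        return self._obs(), reward, self.loop >= self.max_loops, False, {}


# Train the PPO controller across many short optimization runs.
env = SomaParamEnv()
model = PPO("MlpPolicy", env, verbose=0)
model.learn(total_timesteps=20_000)

In the full algorithm the agent would control iSOMA's actual strategy parameters, and the choice of reward function is itself a studied design decision; the sketch only captures the observe-act-reward loop wrapped around a migration step.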
Pages: 17