Robust Black-Box Optimization for Stochastic Search and Episodic Reinforcement Learning

Times cited: 0
Authors
Huttenrauch, Maximilian [1]
Neumann, Gerhard [1]
Affiliations
[1] Karlsruhe Inst Technol, Dept Comp Sci, Karlsruhe, Germany
Keywords
black-box optimization; stochastic search; derivative-free optimization; evolution strategies; episodic reinforcement learning; EVOLUTIONARY;
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Black-box optimization is a versatile approach to solving complex problems where the objective function is not explicitly known and no higher-order information is available. Due to its general nature, it finds widespread application in function optimization as well as machine learning, especially episodic reinforcement learning tasks. While traditional black-box optimizers like CMA-ES may falter in noisy scenarios due to their reliance on ranking-based transformations, a promising alternative emerges in the form of the Model-based Relative Entropy Stochastic Search (MORE) algorithm. MORE can be derived from natural policy gradients and compatible function approximation, and it directly optimizes the expected fitness without resorting to rankings. However, in its original formulation, MORE often cannot achieve state-of-the-art performance. In this paper, we improve MORE in three ways: we decouple the update of the search distribution's mean and covariance, we introduce an improved entropy scheduling technique based on an evolution path that results in faster convergence, and we simplify the model learning approach in comparison to the original paper. We show that our algorithm performs comparably to state-of-the-art black-box optimizers on standard benchmark functions. Further, it clearly outperforms ranking-based methods and other policy-gradient-based black-box algorithms, as well as state-of-the-art deep reinforcement learning algorithms, when used for episodic reinforcement learning tasks.
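To make the setting in the abstract concrete, the following is a minimal sketch of a Gaussian stochastic search loop that weights samples by their fitness values rather than by their ranks, and that uses separate step sizes for the mean and the covariance. It is not the MORE algorithm and not the authors' method: the softmax weighting, the function names (`gaussian_search`, `sphere`), and all parameter values are illustrative assumptions, and the sketch omits MORE's surrogate model, relative-entropy constraints, and entropy schedule.

```python
import numpy as np


def sphere(x):
    """Hypothetical benchmark objective to minimize: sum of squares."""
    return float(np.sum(x ** 2))


def gaussian_search(objective, dim=5, iterations=200, population=50,
                    mean_step=1.0, cov_step=0.3, temperature=1.0, seed=0):
    """Fitness-weighted Gaussian search (illustrative assumption, not MORE)."""
    rng = np.random.default_rng(seed)
    mean = np.full(dim, 3.0)   # arbitrary starting point
    cov = np.eye(dim)          # initial search covariance

    for _ in range(iterations):
        # Sample candidates from the current search distribution.
        samples = rng.multivariate_normal(mean, cov, size=population)

        # Negate so that larger fitness is better (we minimize the objective).
        fitness = -np.array([objective(x) for x in samples])

        # Weights depend on standardized fitness values, not on rankings.
        z = (fitness - fitness.mean()) / (fitness.std() + 1e-12)
        weights = np.exp(z / temperature)
        weights /= weights.sum()

        # Weighted estimates of the new mean and covariance.
        new_mean = weights @ samples
        centered = samples - mean
        new_cov = (centered * weights[:, None]).T @ centered

        # Separate step sizes for the mean and the covariance updates.
        mean = (1.0 - mean_step) * mean + mean_step * new_mean
        cov = (1.0 - cov_step) * cov + cov_step * new_cov
        cov += 1e-10 * np.eye(dim)  # small jitter for numerical stability

    return mean, cov


if __name__ == "__main__":
    final_mean, final_cov = gaussian_search(sphere)
    print("objective at final mean:", sphere(final_mean))
```

Using a smaller step size for the covariance than for the mean is one simple way to let the mean move quickly while the search distribution shrinks more cautiously; the paper's actual decoupling is formulated differently and should be taken from the paper itself.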
Pages: 1-44
Number of pages: 44