Predictive reinforcement learning in non-stationary environments using weighted mixture policy

Cited by: 0
Authors
Pourshamsaei, Hossein [1 ]
Nobakhti, Amin [1 ]
Affiliations
[1] Sharif Univ Technol, Dept Elect Engn, Azadi Ave, Tehran 111554363, Iran
Keywords
Reinforcement learning; Non-stationary environments; Adaptive learning rate; Mixture policy; Predictive reference tracking; MODEL
DOI
10.1016/j.asoc.2024.111305
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Reinforcement Learning (RL) in non-stationary environments is a formidable challenge. In some applications, abrupt changes in the environment model can be anticipated, yet the existing literature lacks a framework that proactively exploits such predictions to improve reward optimization. This paper introduces a methodology that leverages these predictions preemptively to maximize overall performance. It does so by forming a weighted mixture policy from the optimal policies of the prevailing and forthcoming models. To ensure safe learning, an adaptive learning rate is derived for training the weighted mixture policy; this theoretically guarantees a monotonic performance improvement at each update. Empirical trials consider a model-free predictive reference-tracking scenario with piecewise-constant references. On the cart-pole position control problem, the proposed algorithm is shown to outperform prior techniques such as context Q-learning and RL with context detection in non-stationary environments. It also outperforms applying the individual optimal policy of each observed environment model (i.e., policies that do not use predictions).
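The record itself contains no code; the snippet below is a minimal illustrative sketch (not the paper's algorithm) of the central idea named in the abstract: blending the optimal policy of the prevailing model with that of the predicted forthcoming model through a time-varying weight. The tabular setting, the function names, and the linear ramp toward the predicted switch time are all assumptions made for illustration; the paper's adaptive learning rate and its monotonic-improvement guarantee are not reproduced here.

```python
import numpy as np

def mixture_policy(pi_current: np.ndarray, pi_next: np.ndarray, w: float) -> np.ndarray:
    """Weighted mixture of two stochastic policies (rows: states, cols: action probs).

    pi_mix(a|s) = (1 - w) * pi_current(a|s) + w * pi_next(a|s),  with 0 <= w <= 1.
    """
    assert 0.0 <= w <= 1.0
    return (1.0 - w) * pi_current + w * pi_next

def mixing_weight(t: int, t_switch: int, horizon: int) -> float:
    """Hypothetical linear ramp: shift weight toward the forthcoming model's
    optimal policy as the predicted change time t_switch approaches."""
    return float(np.clip(1.0 - (t_switch - t) / horizon, 0.0, 1.0))

# Toy usage: 3 states, 2 actions; both policies are random placeholders here.
rng = np.random.default_rng(0)
pi_cur = rng.dirichlet(np.ones(2), size=3)  # stands in for the prevailing model's optimal policy
pi_nxt = rng.dirichlet(np.ones(2), size=3)  # stands in for the predicted model's optimal policy

for t in range(95, 101):  # an environment change is predicted at t = 100
    w = mixing_weight(t, t_switch=100, horizon=5)
    pi_mix = mixture_policy(pi_cur, pi_nxt, w)
    assert np.allclose(pi_mix.sum(axis=1), 1.0)  # each row is still a distribution
```

Because a convex combination of two probability distributions is itself a distribution, such a mixture permits a gradual handoff to the forthcoming model's policy before the predicted change actually occurs.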
Pages: 16
Related Papers
50 records in total
  • [1] Reinforcement learning algorithm for non-stationary environments
    Padakandla, Sindhu
    Prabuchandran, K. J.
    Bhatnagar, Shalabh
    APPLIED INTELLIGENCE, 2020, 50 (11) : 3590 - 3606
  • [2] Towards Reinforcement Learning for Non-stationary Environments
    Dal Toe, Sebastian Gregory
    Tiddeman, Bernard
    Mac Parthalain, Neil
    ADVANCES IN COMPUTATIONAL INTELLIGENCE SYSTEMS, UKCI 2023, 2024, 1453 : 41 - 52
  • [3] An adaptable fuzzy reinforcement learning method for non-stationary environments
    Haighton, Rachel
    Asgharnia, Amirhossein
    Schwartz, Howard
    Givigi, Sidney
    NEUROCOMPUTING, 2024, 604
  • [4] A robust policy bootstrapping algorithm for multi-objective reinforcement learning in non-stationary environments
    Abdelfattah, Sherif
    Kasmarik, Kathryn
    Hu, Jiankun
    ADAPTIVE BEHAVIOR, 2020, 28 (04) : 273 - 292
  • [5] Enhanced Deep Reinforcement Learning for Parcel Singulation in Non-Stationary Environments
    Shen, Jiwei
    Lu, Hu
    Zhang, Hao
    Lyu, Shujing
    Lu, Yue
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024: 86 - 90
  • [6] Learning Latent and Changing Dynamics in Real Non-Stationary Environments
    Liu, Zihe
    Lu, Jie
    Xuan, Junyu
    Zhang, Guangquan
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2025, 37 (04) : 1930 - 1942
  • [7] Prediction-Based Multi-Agent Reinforcement Learning in Inherently Non-Stationary Environments
    Marinescu, Andrei
    Dusparic, Ivana
    Clarke, Siobhan
    ACM TRANSACTIONS ON AUTONOMOUS AND ADAPTIVE SYSTEMS, 2017, 12 (02)
  • [8] Adaptive Learning With Extreme Verification Latency in Non-Stationary Environments
    Idrees, Mobin M. M.
    Stahl, Frederic
    Badii, Atta
    IEEE ACCESS, 2022, 10 : 127345 - 127364
  • [9] Reliable Localized On-line Learning in Non-stationary Environments
    Buschermoehle, Andreas
    Brockmann, Werner
    2014 IEEE CONFERENCE ON EVOLVING AND ADAPTIVE INTELLIGENT SYSTEMS (EAIS), 2014