Predictive reinforcement learning in non-stationary environments using weighted mixture policy

Cited by: 0
Authors
Pourshamsaei, Hossein [1 ]
Nobakhti, Amin [1 ]
Affiliations
[1] Sharif Univ Technol, Dept Elect Engn, Azadi Ave, Tehran 111554363, Iran
Keywords
Reinforcement learning; Non-stationary environments; Adaptive learning rate; Mixture policy; Predictive reference tracking; MODEL
DOI
10.1016/j.asoc.2024.111305
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Reinforcement Learning (RL) in non-stationary environments is a formidable challenge. In some applications, abrupt changes in the environment model can be anticipated, yet the existing literature lacks a framework that proactively exploits such predictions to improve reward optimization. This paper introduces a methodology that leverages these predictions preemptively to maximize overall performance. It does so by forming a weighted mixture policy from the optimal policies of the prevailing and forthcoming models. To ensure safe learning, an adaptive learning rate is derived for training the weighted mixture policy; this theoretically guarantees a monotonic performance improvement at each update. Empirical trials consider a model-free predictive reference-tracking scenario with piecewise-constant references. On the cart-pole position control problem, the proposed algorithm is shown to outperform prior techniques such as context Q-learning and RL with context detection in non-stationary environments. It also outperforms applying the individual optimal policy of each observed environment model (i.e., policies that do not use predictions).
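The record itself contains no code; the snippet below is a minimal illustrative sketch (not the paper's algorithm) of the central idea named in the abstract: blending the optimal policy of the prevailing model with that of the predicted forthcoming model through a time-varying weight. The tabular setting, the function names, and the linear ramp toward the predicted switch time are all assumptions made for illustration; the paper's adaptive learning rate and its monotonic-improvement guarantee are not reproduced here.

```python
import numpy as np

def mixture_policy(pi_current: np.ndarray, pi_next: np.ndarray, w: float) -> np.ndarray:
    """Weighted mixture of two stochastic policies (rows: states, cols: action probs).

    pi_mix(a|s) = (1 - w) * pi_current(a|s) + w * pi_next(a|s),  with 0 <= w <= 1.
    """
    assert 0.0 <= w <= 1.0
    return (1.0 - w) * pi_current + w * pi_next

def mixing_weight(t: int, t_switch: int, horizon: int) -> float:
    """Hypothetical linear ramp: shift weight toward the forthcoming model's
    optimal policy as the predicted change time t_switch approaches."""
    return float(np.clip(1.0 - (t_switch - t) / horizon, 0.0, 1.0))

# Toy usage: 3 states, 2 actions; both policies are random placeholders here.
rng = np.random.default_rng(0)
pi_cur = rng.dirichlet(np.ones(2), size=3)  # stands in for the prevailing model's optimal policy
pi_nxt = rng.dirichlet(np.ones(2), size=3)  # stands in for the predicted model's optimal policy

for t in range(95, 101):  # an environment change is predicted at t = 100
    w = mixing_weight(t, t_switch=100, horizon=5)
    pi_mix = mixture_policy(pi_cur, pi_nxt, w)
    assert np.allclose(pi_mix.sum(axis=1), 1.0)  # each row is still a distribution
```

Because a convex combination of two probability distributions is itself a distribution, such a mixture permits a gradual handoff to the forthcoming model's policy before the predicted change actually occurs.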
Pages: 16
Related Papers
50 records in total
  • [1] Reinforcement learning algorithm for non-stationary environments
    Padakandla, Sindhu
    Prabuchandran, K. J.
    Bhatnagar, Shalabh
    APPLIED INTELLIGENCE, 2020, 50 (11) : 3590 - 3606
  • [2] Towards Reinforcement Learning for Non-stationary Environments
    Dal Toe, Sebastian Gregory
    Tiddeman, Bernard
    Mac Parthalain, Neil
    ADVANCES IN COMPUTATIONAL INTELLIGENCE SYSTEMS, UKCI 2023, 2024, 1453 : 41 - 52
  • [3] An adaptable fuzzy reinforcement learning method for non-stationary environments
    Haighton, Rachel
    Asgharnia, Amirhossein
    Schwartz, Howard
    Givigi, Sidney
    NEUROCOMPUTING, 2024, 604
  • [4] A robust policy bootstrapping algorithm for multi-objective reinforcement learning in non-stationary environments
    Abdelfattah, Sherif
    Kasmarik, Kathryn
    Hu, Jiankun
    ADAPTIVE BEHAVIOR, 2020, 28 (04) : 273 - 292
  • [5] Enhanced Deep Reinforcement Learning for Parcel Singulation in Non-Stationary Environments
    Shen, Jiwei
    Lu, Hu
    Zhang, Hao
    Lyu, Shujing
    Lu, Yue
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024: 86 - 90
  • [6] Learning Latent and Changing Dynamics in Real Non-Stationary Environments
    Liu, Zihe
    Lu, Jie
    Xuan, Junyu
    Zhang, Guangquan
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2025, 37 (04) : 1930 - 1942
  • [7] Prediction-Based Multi-Agent Reinforcement Learning in Inherently Non-Stationary Environments
    Marinescu, Andrei
    Dusparic, Ivana
    Clarke, Siobhan
    ACM TRANSACTIONS ON AUTONOMOUS AND ADAPTIVE SYSTEMS, 2017, 12 (02)
  • [8] Adaptive Learning With Extreme Verification Latency in Non-Stationary Environments
    Idrees, Mobin M. M.
    Stahl, Frederic
    Badii, Atta
    IEEE ACCESS, 2022, 10 : 127345 - 127364
  • [9] Reliable Localized On-line Learning in Non-stationary Environments
    Buschermoehle, Andreas
    Brockmann, Werner
    2014 IEEE CONFERENCE ON EVOLVING AND ADAPTIVE INTELLIGENT SYSTEMS (EAIS), 2014