Predictive reinforcement learning in non-stationary environments using weighted mixture policy

Cited by: 0
Authors
Pourshamsaei, Hossein [1]
Nobakhti, Amin [1]
Affiliations
[1] Sharif Univ Technol, Dept Elect Engn, Azadi Ave, Tehran 111554363, Iran
Keywords
Reinforcement learning; Non-stationary environments; Adaptive learning rate; Mixture policy; Predictive reference tracking; MODEL;
DOI
10.1016/j.asoc.2024.111305
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Reinforcement Learning (RL) within non-stationary environments presents a formidable challenge. In some applications, anticipating abrupt alterations in the environment model might be possible. The existing literature lacks a framework that proactively harnesses such predictions to enhance reward optimization. This paper introduces an innovative methodology designed to preemptively leverage these predictions, thereby maximizing the overall achieved performance. This is executed by formulating a novel approach that generates a weighted mixture policy from both the optimal policies of the prevailing and forthcoming models. To ensure safe learning, an adaptive learning rate is derived to facilitate training of the weighted mixture policy. This theoretically guarantees monotonic performance improvement at each update during training. Empirical trials focus on a model-free predictive reference tracking scenario involving piecewise constant references. Through the utilization of the cart-pole position control problem, it is demonstrated that the proposed algorithm surpasses prior techniques such as context Q-learning and RL with context detection algorithms in non-stationary environments. Moreover, the algorithm outperforms the application of individual optimal policies derived from each observed environment model (i.e., policies not utilizing predictions).
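As a rough illustration of the weighted-mixture idea in the abstract, the sketch below blends two discrete action distributions: one for the prevailing model and one for the forthcoming model. It is not the paper's algorithm. The function names (`mixture_policy`, `ramp_weight`) and the linear ramp schedule are assumptions made here for illustration; the paper trains the mixture with an adaptive learning rate, which this sketch does not reproduce.

```python
import numpy as np


def mixture_policy(pi_current, pi_next, weight):
    """Convex combination of two action distributions (hypothetical helper).

    weight = 0 returns the current-model policy, weight = 1 the next-model policy.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    pi_current = np.asarray(pi_current, dtype=float)
    pi_next = np.asarray(pi_next, dtype=float)
    mixed = (1.0 - weight) * pi_current + weight * pi_next
    # Renormalize to guard against floating-point drift.
    return mixed / mixed.sum()


def ramp_weight(steps_until_change, horizon):
    """Assumed linear schedule: 0 far from the predicted change, 1 at the change."""
    return float(np.clip(1.0 - steps_until_change / horizon, 0.0, 1.0))


# Example: two actions, predicted model change 5 steps away, 10-step horizon.
pi_now = [0.9, 0.1]    # action probabilities under the current model's policy
pi_later = [0.2, 0.8]  # action probabilities under the forthcoming model's policy
w = ramp_weight(steps_until_change=5, horizon=10)
pi_mix = mixture_policy(pi_now, pi_later, w)
```

In the paper's setting the weights are learned rather than fixed by a schedule; the hand-written ramp here only shows how a prediction of an upcoming change could shift probability mass toward the forthcoming model's policy ahead of time.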
Pages: 16