Learning Latent and Changing Dynamics in Real Non-Stationary Environments

Cited: 0
Authors
Liu, Zihe [1 ]
Lu, Jie [1 ]
Xuan, Junyu [1 ]
Zhang, Guangquan [1 ]
Affiliations
[1] Univ Technol Sydney, Australian Artificial Intelligence Inst AAII, Ultimo, NSW 2007, Australia
Funding
Australian Research Council
Keywords
Adaptation models; Reinforcement learning; Computational modeling; Heuristic algorithms; Robots; Planning; Complexity theory; Video games; Training; Predictive models; non-stationary environments; model adaptation;
DOI
10.1109/TKDE.2025.3535961
CLC Classification Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Model-based reinforcement learning (RL) aims to learn the underlying dynamics of a given environment. The success of most existing works rests on the critical assumption that the dynamics are fixed, which is unrealistic in many open-world scenarios, such as drone delivery and online chatting, where agents may need to deal with environments whose dynamics change unpredictably (hereafter, real non-stationary environments). Learning changing dynamics in a real non-stationary environment therefore offers significant benefits but also poses significant challenges. This paper proposes a new model-based reinforcement learning algorithm that proactively and dynamically detects possible changes and Learns these Latent and Changing Dynamics (LLCD) in a latent Markovian space for real non-stationary environments. To ensure the Markovian property of the RL model and improve computational efficiency, we employ a latent space model to learn the environment's transition dynamics. Furthermore, we perform online change detection in the latent space to promptly identify change points in non-stationary environments. We then use the detected information to help the agent adapt to the new conditions. Experiments indicate that, among other benefits, the proposed algorithm accumulates reward fastest by adapting most rapidly to environmental changes. This work has strong potential to enhance model-based reinforcement learning in changing environments.
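The abstract does not specify the detection statistic, so the following is a minimal, hypothetical Python sketch of one way online change detection on latent prediction errors could work: a CUSUM-style test that raises an alarm when the latent transition model's prediction error persistently exceeds its running baseline. The class name LatentChangeDetector, the drift and threshold parameters, and the synthetic demo are all illustrative assumptions, not the paper's implementation.

```python
import numpy as np


class LatentChangeDetector:
    """CUSUM-style detector on latent prediction errors (illustrative sketch)."""

    def __init__(self, drift: float = 0.05, threshold: float = 5.0):
        self.drift = drift          # per-step error excess tolerated before accumulating
        self.threshold = threshold  # alarm level for the cumulative statistic
        self.cusum = 0.0            # running cumulative sum of error excesses
        self.mean_error = None      # running baseline of the prediction error

    def update(self, z_pred: np.ndarray, z_true: np.ndarray) -> bool:
        """Feed one (predicted, observed) latent pair; return True on a change alarm."""
        error = float(np.linalg.norm(z_pred - z_true))
        if self.mean_error is None:
            self.mean_error = error  # first observation sets the baseline
            return False
        # Accumulate how far the error exceeds the baseline plus tolerated drift.
        self.cusum = max(0.0, self.cusum + error - self.mean_error - self.drift)
        # Track the baseline slowly so gradual noise changes do not trigger alarms.
        self.mean_error = 0.99 * self.mean_error + 0.01 * error
        if self.cusum > self.threshold:
            self.cusum = 0.0         # reset the statistic after an alarm
            self.mean_error = error  # re-baseline to the new dynamics
            return True
        return False


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    detector = LatentChangeDetector()
    for t in range(200):
        # Synthetic residuals: the "dynamics" change at step 100,
        # making latent predictions much less accurate afterwards.
        scale = 0.1 if t < 100 else 0.8
        z_true = rng.normal(0.0, scale, size=4)
        z_pred = np.zeros(4)  # stand-in for the latent model's prediction
        if detector.update(z_pred, z_true):
            print(f"Change detected at step {t}")
```

On an alarm, an agent in the spirit of LLCD would treat subsequent transitions as coming from new dynamics and adapt or re-fit its latent transition model, which is the adaptation step the abstract describes.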
Pages: 1930-1942
Page count: 13