Model-Based Reinforcement Learning With Isolated Imaginations

Cited by: 0
Authors
Pan, Minting [1 ]
Zhu, Xiangming [1 ]
Zheng, Yitao [1 ]
Wang, Yunbo [1 ]
Yang, Xiaokang [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, AI Inst, MoE Key Lab Artificial Intelligence, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Decoupled dynamics; model-based reinforcement learning; world model;
DOI
10.1109/TPAMI.2023.3335263
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
World models learn the consequences of actions in vision-based interactive systems. However, in practical scenarios like autonomous driving, there often exist noncontrollable dynamics that are independent of, or only sparsely dependent on, the action signals, making it challenging to learn effective world models. To address this issue, we propose Iso-Dream++, a model-based reinforcement learning approach with two main contributions. First, we optimize the inverse dynamics to encourage the world model to isolate controllable state transitions from the mixed spatiotemporal variations of the environment. Second, we perform policy optimization based on the decoupled latent imaginations, where we roll out noncontrollable states into the future and adaptively associate them with the current controllable state. This allows long-horizon visuomotor control tasks to benefit from isolating the mixed dynamics sources in the wild, e.g., a self-driving car can anticipate the movement of other vehicles and thereby avoid potential risks. Building on our previous work (Pan et al. 2022), we further consider the sparse dependencies between controllable and noncontrollable states, address the training collapse problem of state decoupling, and validate our approach in transfer learning setups. Our empirical study demonstrates that Iso-Dream++ significantly outperforms existing reinforcement learning methods on CARLA and DeepMind Control.
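The record contains no code, but the two contributions described in the abstract are concrete enough to sketch. The following PyTorch snippet illustrates, under loose assumptions, how a world model might decouple controllable from noncontrollable latent states: an action-conditioned transition plus an inverse-dynamics loss for the controllable branch, and an action-free transition for the noncontrollable branch that can be rolled out ahead of the policy. All names (DecoupledWorldModel, imagine_free), network sizes, and loss forms are illustrative assumptions, not the authors' implementation, which is considerably more elaborate than this sketch.

import torch
import torch.nn as nn


class DecoupledWorldModel(nn.Module):
    """Toy world model with separate controllable and noncontrollable
    latent branches. All names and sizes are illustrative."""

    def __init__(self, obs_dim: int, action_dim: int, latent_dim: int = 32):
        super().__init__()
        # One encoder whose output is split into the two branches.
        self.encoder = nn.Linear(obs_dim, 2 * latent_dim)
        # Action-conditioned transition: only this branch sees actions.
        self.ctrl_dynamics = nn.Sequential(
            nn.Linear(latent_dim + action_dim, 64), nn.ELU(),
            nn.Linear(64, latent_dim),
        )
        # Action-free transition for the noncontrollable branch.
        self.free_dynamics = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ELU(),
            nn.Linear(64, latent_dim),
        )
        # Inverse-dynamics head: recover the action from consecutive
        # controllable states, pushing action-relevant information
        # into the controllable branch during training.
        self.inverse_dynamics = nn.Sequential(
            nn.Linear(2 * latent_dim, 64), nn.ELU(),
            nn.Linear(64, action_dim),
        )

    def forward(self, obs, next_obs, action):
        s, z = self.encoder(obs).chunk(2, dim=-1)        # controllable s, noncontrollable z
        s_next, z_next = self.encoder(next_obs).chunk(2, dim=-1)
        s_pred = self.ctrl_dynamics(torch.cat([s, action], dim=-1))
        z_pred = self.free_dynamics(z)                   # no action input
        a_pred = self.inverse_dynamics(torch.cat([s, s_next], dim=-1))
        dyn_loss = (s_pred - s_next.detach()).pow(2).mean() \
                 + (z_pred - z_next.detach()).pow(2).mean()
        inv_loss = (a_pred - action).pow(2).mean()       # inverse-dynamics objective
        return dyn_loss + inv_loss

    @torch.no_grad()
    def imagine_free(self, z, horizon: int):
        """Roll the action-free branch into the future; a policy could
        attend over this window alongside the current controllable state."""
        future = []
        for _ in range(horizon):
            z = self.free_dynamics(z)
            future.append(z)
        return torch.stack(future, dim=1)  # (batch, horizon, latent_dim)


# Minimal usage with random tensors standing in for observations.
model = DecoupledWorldModel(obs_dim=64, action_dim=4)
obs, next_obs, action = torch.randn(8, 64), torch.randn(8, 64), torch.randn(8, 4)
loss = model(obs, next_obs, action)
loss.backward()

The design choice mirrored here is that only the controllable branch receives actions, while the inverse-dynamics loss gives that branch a training signal to carry action-relevant information; the action-free branch can then be imagined several steps ahead, as the abstract describes for anticipating the movement of other vehicles.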
Pages: 2788-2803
Number of pages: 16