Model-Based Reinforcement Learning With Isolated Imaginations

Cited by: 0
Authors
Pan, Minting [1 ]
Zhu, Xiangming [1 ]
Zheng, Yitao [1 ]
Wang, Yunbo [1 ]
Yang, Xiaokang [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, AI Inst, MoE Key Lab Artificial Intelligence, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Decoupled dynamics; model-based reinforcement learning; world model;
DOI
10.1109/TPAMI.2023.3335263
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
World models learn the consequences of actions in vision-based interactive systems. However, in practical scenarios like autonomous driving, there often exist noncontrollable dynamics that are independent of, or only sparsely dependent on, the action signals, making it challenging to learn effective world models. To address this issue, we propose Iso-Dream++, a model-based reinforcement learning approach with two main contributions. First, we optimize the inverse dynamics to encourage the world model to isolate controllable state transitions from the mixed spatiotemporal variations of the environment. Second, we perform policy optimization based on the decoupled latent imaginations, where we roll out noncontrollable states into the future and adaptively associate them with the current controllable state. This allows long-horizon visuomotor control tasks to benefit from isolating the mixed dynamics sources in the wild, e.g., a self-driving car can anticipate the movement of other vehicles and thereby avoid potential risks. Building on our previous work (Pan et al. 2022), we further consider the sparse dependencies between controllable and noncontrollable states, address the training collapse problem of state decoupling, and validate our approach in transfer learning setups. Our empirical study demonstrates that Iso-Dream++ significantly outperforms existing reinforcement learning methods on CARLA and DeepMind Control.
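The record contains no code, but the two contributions described in the abstract are concrete enough to sketch. The following PyTorch snippet illustrates, under loose assumptions, how a world model might decouple controllable from noncontrollable latent states: an action-conditioned transition plus an inverse-dynamics loss for the controllable branch, and an action-free transition for the noncontrollable branch that can be rolled out ahead of the policy. All names (DecoupledWorldModel, imagine_free), network sizes, and loss forms are illustrative assumptions, not the authors' implementation, which is considerably more elaborate than this sketch.

import torch
import torch.nn as nn


class DecoupledWorldModel(nn.Module):
    """Toy world model with separate controllable and noncontrollable
    latent branches. All names and sizes are illustrative."""

    def __init__(self, obs_dim: int, action_dim: int, latent_dim: int = 32):
        super().__init__()
        # One encoder whose output is split into the two branches.
        self.encoder = nn.Linear(obs_dim, 2 * latent_dim)
        # Action-conditioned transition: only this branch sees actions.
        self.ctrl_dynamics = nn.Sequential(
            nn.Linear(latent_dim + action_dim, 64), nn.ELU(),
            nn.Linear(64, latent_dim),
        )
        # Action-free transition for the noncontrollable branch.
        self.free_dynamics = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ELU(),
            nn.Linear(64, latent_dim),
        )
        # Inverse-dynamics head: recover the action from consecutive
        # controllable states, pushing action-relevant information
        # into the controllable branch during training.
        self.inverse_dynamics = nn.Sequential(
            nn.Linear(2 * latent_dim, 64), nn.ELU(),
            nn.Linear(64, action_dim),
        )

    def forward(self, obs, next_obs, action):
        s, z = self.encoder(obs).chunk(2, dim=-1)        # controllable s, noncontrollable z
        s_next, z_next = self.encoder(next_obs).chunk(2, dim=-1)
        s_pred = self.ctrl_dynamics(torch.cat([s, action], dim=-1))
        z_pred = self.free_dynamics(z)                   # no action input
        a_pred = self.inverse_dynamics(torch.cat([s, s_next], dim=-1))
        dyn_loss = (s_pred - s_next.detach()).pow(2).mean() \
                 + (z_pred - z_next.detach()).pow(2).mean()
        inv_loss = (a_pred - action).pow(2).mean()       # inverse-dynamics objective
        return dyn_loss + inv_loss

    @torch.no_grad()
    def imagine_free(self, z, horizon: int):
        """Roll the action-free branch into the future; a policy could
        attend over this window alongside the current controllable state."""
        future = []
        for _ in range(horizon):
            z = self.free_dynamics(z)
            future.append(z)
        return torch.stack(future, dim=1)  # (batch, horizon, latent_dim)


# Minimal usage with random tensors standing in for observations.
model = DecoupledWorldModel(obs_dim=64, action_dim=4)
obs, next_obs, action = torch.randn(8, 64), torch.randn(8, 64), torch.randn(8, 4)
loss = model(obs, next_obs, action)
loss.backward()

The design choice mirrored here is that only the controllable branch receives actions, while the inverse-dynamics loss gives that branch a training signal to carry action-relevant information; the action-free branch can then be imagined several steps ahead, as the abstract describes for anticipating the movement of other vehicles.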
Pages: 2788-2803
Number of pages: 16