DynaSTI: Dynamics modeling with sequential temporal information for reinforcement learning in Atari

Cited by: 1
Authors
Kim, Jaehoon [1 ]
Lee, Young Jae [1 ]
Kwak, Mingu [2 ]
Park, Young Joon [3 ]
Kim, Seoung Bum [1 ]
Affiliations
[1] Korea Univ, Sch Ind Management Engn, 145 Anam Ro, Seoul 02841, South Korea
[2] Georgia Inst Technol, Sch Ind & Syst Engn, Atlanta, GA USA
[3] LG AI Res, Seoul, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Atari; Dynamics modeling; Hierarchical structure; Self-supervised learning; Reinforcement learning;
DOI
10.1016/j.knosys.2024.112103
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Deep reinforcement learning (DRL) has shown remarkable capabilities in solving sequential decision-making problems. However, DRL requires extensive interactions with image-based environments. Existing methods combine self-supervised learning or data augmentation to improve sample efficiency, but many do not consider the temporal dynamics of the environment, even though understanding them is important for effective learning. To address the sample-efficiency problem, we propose dynamics modeling with sequential temporal information (DynaSTI), which incorporates environmental dynamics and leverages the correlation among trajectories. DynaSTI learns state representations through an auxiliary task, using gated recurrent units to capture temporal information. It also integrates forward and inverse dynamics modeling in a hierarchical configuration, which improves the learning of environmental dynamics compared with using either model alone. The hierarchy stabilizes the training of the inverse dynamics model: instead of taking inputs directly from the encoder, the inverse model receives the output of the forward dynamics model, which focuses on features of the controllable state and thus filters out noisy information. We demonstrate the effectiveness of DynaSTI on the Atari benchmark with environment interactions limited to 100k steps. Extensive experiments confirm that DynaSTI significantly improves the sample efficiency of DRL, outperforming comparison methods on statistically reliable metrics and approaching human-level performance.
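The hierarchical coupling described in the abstract — a GRU aggregating encoded states over time, a forward-dynamics head predicting the next latent state from (hidden state, action), and an inverse-dynamics head inferring the action from the forward model's denoised output rather than raw encoder features — can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: all layer sizes, the numpy-only linear layers, and the single-step GRU are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS, LATENT, HID, N_ACTIONS = 16, 8, 8, 4  # illustrative sizes

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def init(n_in, n_out):
    # Small random weights and zero biases for each linear map.
    return rng.normal(0.0, 0.1, (n_in, n_out)), np.zeros(n_out)

# Parameters: encoder, GRU gates (update z, reset r, candidate n),
# forward-dynamics head, inverse-dynamics head.
We, be = init(OBS, LATENT)
Wz, bz = init(LATENT + HID, HID)
Wr, br = init(LATENT + HID, HID)
Wn, bn = init(LATENT + HID, HID)
Wf, bf = init(HID + N_ACTIONS, LATENT)     # forward: (h, action) -> next latent
Wi, bi = init(LATENT + LATENT, N_ACTIONS)  # inverse: (pred, next latent) -> action

def encode(obs):
    return np.tanh(obs @ We + be)

def gru_step(x, h):
    # Standard GRU cell equations, capturing sequential temporal information.
    z = sigmoid(np.concatenate([x, h]) @ Wz + bz)
    r = sigmoid(np.concatenate([x, h]) @ Wr + br)
    n = np.tanh(np.concatenate([x, r * h]) @ Wn + bn)
    return (1 - z) * n + z * h

def forward_dynamics(h, action):
    a = np.eye(N_ACTIONS)[action]  # one-hot action
    return np.tanh(np.concatenate([h, a]) @ Wf + bf)

def inverse_dynamics(pred_next, next_latent):
    # Hierarchical coupling: the inverse model consumes the forward model's
    # prediction (controllable-state features) instead of raw encoder output.
    return np.concatenate([pred_next, next_latent]) @ Wi + bi

# One trajectory: fold an observation sequence through the GRU, then apply heads.
obs_seq = rng.normal(size=(5, OBS))
h = np.zeros(HID)
for obs in obs_seq:
    h = gru_step(encode(obs), h)

action = 2
pred_next = forward_dynamics(h, action)
logits = inverse_dynamics(pred_next, encode(rng.normal(size=OBS)))
print(pred_next.shape, logits.shape)  # (8,) (4,)
```

In training, the forward head would be fit to the encoder's next-state latent and the inverse head to the taken action as self-supervised auxiliary losses; those losses are omitted here for brevity.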
Pages: 12
Related papers
50 records in total
  • [1] STACoRe: Spatio-temporal and action-based contrastive representations for reinforcement learning in Atari
    Lee, Young Jae
    Kim, Jaehoon
    Kwak, Mingu
    Park, Young Joon
    Kim, Seoung Bum
    NEURAL NETWORKS, 2023, 160 : 1 - 11
  • [2] Domain Adaptation for Reinforcement Learning on the Atari
    Carr, Thomas
    Chli, Maria
    Vogiatzis, George
    AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 1859 - 1861
  • [3] Visual Rationalizations in Deep Reinforcement Learning for Atari Games
    Weitkamp, Laurens
    van der Pol, Elise
    Akata, Zeynep
    ARTIFICIAL INTELLIGENCE, BNAIC 2018, 2019, 1021 : 151 - 165
  • [4] Playing Atari with Hybrid Quantum-Classical Reinforcement Learning
    Lockwood, Owen
    Si, Mei
    NEURIPS 2020 WORKSHOP ON PRE-REGISTRATION IN MACHINE LEARNING, VOL 148, 2020, 148 : 285 - 301
  • [5] Reinforcement Learning with Sequential Information Clustering in Real-Time Bidding
    Lu, Junwei
    Yang, Chaoqi
    Gao, Xiaofeng
    Wang, Liubin
    Li, Changcheng
    Chen, Guihai
    PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM '19), 2019, : 1633 - 1641
  • [6] Modeling of plant dynamics and control based on reinforcement learning
    Maeda, Tomoyuki
    Nakayama, Makishi
    Kitamura, Akira
2006 SICE-ICASE INTERNATIONAL JOINT CONFERENCE, VOLS 1-13, 2006, : 3088+
  • [7] Modeling and control for plant dynamics based on reinforcement learning
    Maeda, Tomoyuki
    Nakayama, Makishi
    Narazaki, Hiroshi
    Kitamura, Akira
    IEEJ Transactions on Industry Applications, 2009, 129 (04): : 363 - 367+2
  • [8] State of the Art Control of Atari Games Using Shallow Reinforcement Learning
    Liang, Yitao
    Machado, Marlos C.
    Talvitie, Erik
    Bowling, Michael
    AAMAS'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS, 2016, : 485 - 493
  • [9] Reinforcement Learning Framework for Modeling Spatial Sequential Decisions under Uncertainty
    Truc Viet Le
    Liu, Siyuan
    Lau, Hoong Chuin
    AAMAS'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS, 2016, : 1449 - 1450
  • [10] Blackbox Attacks on Reinforcement Learning Agents Using Approximated Temporal Information
    Zhao, Yiren
    Shumailov, Ilia
    Cui, Han
Gao, Xitong
    Mullins, Robert
    Anderson, Ross
    50TH ANNUAL IEEE/IFIP INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS WORKSHOPS (DSN-W 2020), 2020, : 16 - 24