Autonomous reinforcement learning with experience replay

Cited by: 38
Authors
Wawrzynski, Pawel [1]
Tanwani, Ajay Kumar [1, 2]
Affiliations
[1] Warsaw Univ Technol, Inst Control & Computat Engn, Warsaw, Poland
[2] Ecole Polytech Fed Lausanne, CH-1015 Lausanne, Switzerland
Keywords
Reinforcement learning; Autonomous learning; Step-size estimation; Actor-critic; ACTOR-CRITIC ALGORITHMS; RATE ADAPTATION; ENVIRONMENTS; CONVERGENCE; NETWORKS
DOI
10.1016/j.neunet.2012.11.007
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper considers the issues of efficiency and autonomy that must be addressed to make reinforcement learning suitable for real-life control tasks. A real-time reinforcement learning algorithm is presented that repeatedly adjusts the control policy using previously collected samples and autonomously estimates the appropriate step-sizes for the learning updates. The algorithm is based on the actor-critic with experience replay, whose step-sizes are determined on-line by an enhanced fixed point algorithm for on-line neural network training. An experimental study with a simulated octopus arm and a half-cheetah demonstrates the feasibility of the proposed algorithm for solving difficult learning control problems autonomously within a reasonably short time. © 2012 Elsevier Ltd. All rights reserved.
Pages: 156-167
Number of pages: 12