Autonomous reinforcement learning with experience replay

Cited by: 38

Authors:

Wawrzynski, Pawel [1]

Tanwani, Ajay Kumar [1,2]

Affiliations:

[1] Warsaw Univ Technol, Inst Control & Computat Engn, Warsaw, Poland

[2] Ecole Polytech Fed Lausanne, CH-1015 Lausanne, Switzerland

Source:

NEURAL NETWORKS | 2013 / Vol. 41

Keywords:

Reinforcement learning; Autonomous learning; Step-size estimation; Actor-critic; ACTOR-CRITIC ALGORITHMS; RATE ADAPTATION; ENVIRONMENTS; CONVERGENCE; NETWORKS;

DOI:

10.1016/j.neunet.2012.11.007

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper considers the issues of efficiency and autonomy that are required to make reinforcement learning suitable for real-life control tasks. A real-time reinforcement learning algorithm is presented that repeatedly adjusts the control policy with the use of previously collected samples, and autonomously estimates the appropriate step-sizes for the learning updates. The algorithm is based on the actor-critic with experience replay whose step-sizes are determined on-line by an enhanced fixed point algorithm for on-line neural network training. An experimental study with a simulated octopus arm and half-cheetah demonstrates the feasibility of the proposed algorithm to solve difficult learning control problems in an autonomous way within a reasonably short time. (c) 2012 Elsevier Ltd. All rights reserved.


Pages: 156 - 167

Page count: 12

50 items in total

[21] Trial and Error Experience Replay Based Deep Reinforcement Learning
Zhang, Cheng
Ma, Liang
4TH IEEE INTERNATIONAL CONFERENCE ON SMART CLOUD (SMARTCLOUD 2019) / 3RD INTERNATIONAL SYMPOSIUM ON REINFORCEMENT LEARNING (ISRL 2019), 2019, : 221 - 226
[22] Asynchronous Curriculum Experience Replay: A Deep Reinforcement Learning Approach for UAV Autonomous Motion Control in Unknown Dynamic Environments
Hu, Zijian
Gao, Xiaoguang
Wan, Kaifang
Wang, Qianglong
Zhai, Yiwei
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2023, 72 (11) : 13985 - 14001
[23] Balanced prioritized experience replay in off-policy reinforcement learning
Lou Z.
Wang Y.
Shan S.
Zhang K.
Wei H.
Neural Computing and Applications, 2024, 36 (25) : 15721 - 15737
[24] Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning
Foerster, Jakob
Nardelli, Nantas
Farquhar, Gregory
Afouras, Triantafyllos
Torr, Philip H. S.
Kohli, Pushmeet
Whiteson, Shimon
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
[25] Robust experience replay sampling for multi-agent reinforcement learning
Nicholaus, Isack Thomas
Kang, Dae-Ki
PATTERN RECOGNITION LETTERS, 2022, 155 : 135 - 142
[26] Enhanced Off-Policy Reinforcement Learning With Focused Experience Replay
Kong, Seung-Hyun
Nahrendra, I. Made Aswin
Paek, Dong-Hee
IEEE ACCESS, 2021, 9 (09) : 93152 - 93164
[27] Experience Replay Optimization via ESMM for Stable Deep Reinforcement Learning
Osei, Richard Sakyi
Lopez, Daphne
INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2024, 15 (01) : 715 - 723
[28] Forgetful experience replay in hierarchical reinforcement learning from expert demonstrations
Skrynnik, Alexey
Staroverov, Aleksey
Aitygulov, Ermek
Aksenov, Kirill
Davydov, Vasilii
Panov, Aleksandr I.
KNOWLEDGE-BASED SYSTEMS, 2021, 218
[29] Invariant Transform Experience Replay: Data Augmentation for Deep Reinforcement Learning
Lin, Yijiong
Huang, Jiancong
Zimmer, Matthieu
Guan, Yisheng
Rojas, Juan
Weng, Paul
IEEE ROBOTICS AND AUTOMATION LETTERS, 2020, 5 (04) : 6615 - 6622
[30] Regret Minimization Experience Replay in Off-Policy Reinforcement Learning
Liu, Xu-Hui
Xue, Zhenghai
Pang, Jing-Cheng
Jiang, Shengyi
Xu, Feng
Yu, Yang
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
