Sample-Efficient Deep Reinforcement Learning with Directed Associative Graph

Cited by: 0
Authors
Yang, Dujia [1 ,2 ]
Qin, Xiaowei [1 ,2 ]
Xu, Xiaodong [1 ,2 ]
Li, Chensheng [1 ,2 ]
Wei, Guo [1 ,2 ]
Affiliations
[1] Univ Sci & Technol China, Hefei 230026, Peoples R China
[2] CAS Key Lab Wireless Opt Commun, Hefei 230027, Peoples R China
Keywords
directed associative graph; sample efficiency; deep reinforcement learning;
DOI
Not available
CLC number
TN [Electronic technology, communication technology];
Discipline classification code
0809
Abstract
Reinforcement learning can be modeled mathematically as a Markov decision process. Consequently, the interaction samples and the connection relations between them are the two main types of information available for learning. However, most recent work on deep reinforcement learning treats samples independently, either within their own episode or across episodes. In this paper, to exploit more of the sample information, we propose an additional learning system based on a directed associative graph (DAG). The DAG is built on all trajectories in real time and encodes the full connection relation of all samples across all episodes. By planning along the directed edges of the DAG, we obtain another perspective for estimating state-action pairs, especially those unknown to the deep neural network (DNN) and the episodic memory (EM). A mixed loss function generated by the three learning systems (DNN, EM, and DAG) improves the efficiency of parameter updates in the proposed algorithm. We show that our algorithm significantly outperforms the state-of-the-art algorithm in performance and sample efficiency on the test environments. Furthermore, the convergence of our algorithm is proved in the appendix, and its long-term performance as well as the effect of the DAG are verified.
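The abstract describes the approach only at a high level. As a rough sketch under assumed details (the graph representation, the depth-limited backup rule, and the loss weights `lambda_em` and `lambda_dag` are all illustrative, not taken from the paper), a DAG over observed transitions can be grown in real time and used to produce a planning-based value estimate that supplements the DNN and EM targets in a mixed loss:

```python
from collections import defaultdict

class DirectedAssociativeGraph:
    """Sketch of a DAG over transitions: each observed transition is stored
    as a directed edge state -(action, reward)-> next_state, and planning
    backs up the best discounted return along the recorded edges."""

    def __init__(self, gamma=0.99):
        self.gamma = gamma
        # state -> list of (action, reward, next_state) edges
        self.edges = defaultdict(list)

    def add_transition(self, state, action, reward, next_state):
        # Grow the graph in real time as trajectories are collected.
        self.edges[state].append((action, reward, next_state))

    def plan_value(self, state, depth=10):
        # Best depth-limited discounted return reachable from `state`
        # along the recorded edges; 0 for states with no outgoing edges.
        if depth == 0 or state not in self.edges:
            return 0.0
        return max(r + self.gamma * self.plan_value(s2, depth - 1)
                   for _, r, s2 in self.edges[state])

def mixed_loss(q_dnn, target_td, target_em, target_dag,
               lambda_em=0.1, lambda_dag=0.1):
    """Hypothetical mixed loss combining squared errors toward the
    bootstrapped TD target, the episodic-memory estimate, and the
    DAG planning estimate; the weighting scheme is an assumption."""
    loss_td = (q_dnn - target_td) ** 2
    loss_em = (q_dnn - target_em) ** 2
    loss_dag = (q_dnn - target_dag) ** 2
    return loss_td + lambda_em * loss_em + lambda_dag * loss_dag
```

The DAG estimate is most useful for state-action pairs the DNN and EM have not yet fit well, since planning over recorded edges reuses reward information from every episode that visited the state.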
Pages: 100-113
Number of pages: 14
Related papers
50 items in total
  • [41] Sample-efficient Adversarial Imitation Learning
    Jung, Dahuin
    Lee, Hyungyu
    Yoon, Sungroh
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25 : 1 - 32
  • [43] Surrogate models for device design using sample-efficient Deep Learning
    Patel, Rutu
    Mohapatra, Nihar R.
    Hegde, Ravi S.
    SOLID-STATE ELECTRONICS, 2023, 199
  • [45] Is Plug-in Solver Sample-Efficient for Feature-based Reinforcement Learning?
    Cui, Qiwen
    Yang, Lin F.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [46] Reinforcement Learning Boat Autopilot: A Sample-efficient and Model Predictive Control based Approach
    Cui, Yunduan
    Osaki, Shigeki
    Matsubara, Takamitsu
    2019 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2019, : 2868 - 2875
  • [47] Sample-efficient Actor-Critic Reinforcement Learning with Supervised Data for Dialogue Management
    Su, Pei-Hao
    Budzianowski, Pawel
    Ultes, Stefan
    Gasic, Milica
    Young, Steve
    18TH ANNUAL MEETING OF THE SPECIAL INTEREST GROUP ON DISCOURSE AND DIALOGUE (SIGDIAL 2017), 2017, : 147 - 157
  • [48] Dimension-Wise Importance Sampling Weight Clipping for Sample-Efficient Reinforcement Learning
    Han, Seungyul
    Sung, Youngchul
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [49] Higher Replay Ratio Empowers Sample-Efficient Multi-Agent Reinforcement Learning
    Xu, Linjie
    Liu, Zichuan
    Dockhorn, Alexander
    Perez-Liebana, Diego
    Wang, Jinyu
    Song, Lei
    Bian, Jiang
    2024 IEEE CONFERENCE ON GAMES, COG 2024, 2024
  • [50] Sample-Efficient Model-Free Reinforcement Learning with Off-Policy Critics
    Steckelmacher, Denis
    Plisnier, Helene
    Roijers, Diederik M.
    Nowe, Ann
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2019, PT III, 2020, 11908 : 19 - 34