Sample-Efficient Deep Reinforcement Learning with Directed Associative Graph

Cited by: 0
|
Authors
Yang, Dujia [1 ,2 ]
Qin, Xiaowei [1 ,2 ]
Xu, Xiaodong [1 ,2 ]
Li, Chensheng [1 ,2 ]
Wei, Guo [1 ,2 ]
Affiliations
[1] Univ Sci & Technol China, Hefei 230026, Peoples R China
[2] CAS Key Lab Wireless Opt Commun, Hefei 230027, Peoples R China
Keywords
directed associative graph; sample efficiency; deep reinforcement learning;
DOI
Not available
CLC classification number
TN [Electronic technology, communication technology];
Discipline classification number
0809 ;
Abstract
Reinforcement learning can be modeled mathematically as a Markov decision process. Consequently, the interaction samples, together with the connection relations between them, are the two main types of information available for learning. However, most recent work on deep reinforcement learning treats samples independently, either within their own episode or across episodes. In this paper, to exploit more of this sample information, we propose an additional learning system based on a directed associative graph (DAG). The DAG is built over all trajectories in real time and captures the full connection relations among samples across all episodes. By planning along the directed edges of the DAG, we obtain another perspective for estimating state-action pairs, especially those unknown to the deep neural network (DNN) and the episodic memory (EM). A mixed loss function combining the three learning systems (DNN, EM, and DAG) improves the efficiency of parameter updates in the proposed algorithm. We show that our algorithm significantly outperforms the state-of-the-art algorithms in performance and sample efficiency on the test environments. Furthermore, the convergence of our algorithm is proved in the appendix, and its long-term performance as well as the effects of the DAG are verified.
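The abstract's core idea can be illustrated with a minimal sketch: store transitions from all episodes as a directed graph and back up returns along its edges to obtain an alternative value estimate for a state-action pair. This is a hypothetical simplification under assumed names (`DirectedAssociativeGraph`, `q_estimate`), not the paper's actual implementation; it assumes discrete states, deterministic transitions, and a single stored outcome per edge.

```python
from collections import defaultdict

class DirectedAssociativeGraph:
    """Sketch of a DAG-style learning system: transitions from all
    episodes are merged into one directed graph, and planning over
    the stored edges yields a graph-based value estimate."""

    def __init__(self, gamma=0.99):
        self.gamma = gamma
        # state -> {action: (reward, next_state)}, shared across episodes
        self.edges = defaultdict(dict)
        # state -> value estimate obtained by planning on the graph
        self.value = defaultdict(float)

    def add_transition(self, s, a, r, s_next):
        # Built "in real time": every observed transition becomes an edge.
        self.edges[s][a] = (r, s_next)

    def plan(self, sweeps=10):
        # Value-iteration-style backups restricted to stored edges.
        for _ in range(sweeps):
            for s, actions in self.edges.items():
                self.value[s] = max(r + self.gamma * self.value[s2]
                                    for r, s2 in actions.values())

    def q_estimate(self, s, a):
        # Graph-based estimate for a state-action pair; in the paper's
        # framing this would enter a mixed loss alongside the DNN and
        # episodic-memory estimates. Returns None for unseen pairs.
        if a not in self.edges[s]:
            return None
        r, s2 = self.edges[s][a]
        return r + self.gamma * self.value[s2]
```

For example, two transitions `s0 -> s1 -> s2` with reward 1 each and `gamma=0.5` give `q_estimate("s0", "a") == 1.5` after planning, while a pair never observed returns `None`.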
Pages: 100 - 113
Number of pages: 14
Related papers
50 records in total
  • [21] Relative Entropy Regularized Sample-Efficient Reinforcement Learning With Continuous Actions
    Shang, Zhiwei
    Li, Renxing
    Zheng, Chunhua
    Li, Huiyun
    Cui, Yunduan
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2025, 36 (01) : 475 - 485
  • [22] TEXPLORE: real-time sample-efficient reinforcement learning for robots
    Hester, Todd
    Stone, Peter
    MACHINE LEARNING, 2013, 90 (03) : 385 - 429
  • [23] Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning
    Xie, Tengyang
    Jiang, Nan
    Wang, Huan
    Xiong, Caiming
    Bai, Yu
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [24] Model-aided Deep Reinforcement Learning for Sample-efficient UAV Trajectory Design in IoT Networks
    Esrafilian, Omid
    Bayerlein, Harald
    Gesbert, David
    2021 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2021,
  • [25] Augmented Memory: Sample-Efficient Generative Molecular Design with Reinforcement Learning
    Guo, Jeff
    Schwaller, Philippe
    JACS AU, 2024, 4 (06): : 2160 - 2172
  • [27] Sample-efficient model-based reinforcement learning for quantum control
    Khalid, Irtaza
    Weidner, Carrie A.
    Jonckheere, Edmond A.
    Schirmer, Sophie G.
    Langbein, Frank C.
    PHYSICAL REVIEW RESEARCH, 2023, 5 (04):
  • [28] Sample-Efficient Blockage Prediction and Handover Using Causal Reinforcement Learning
    Kanagamani, Tamizharasan
    Sadasivan, Jishnu
    Banerjee, Serene
    10TH INTERNATIONAL CONFERENCE ON ELECTRONICS, COMPUTING AND COMMUNICATION TECHNOLOGIES, CONECCT 2024, 2024,
  • [29] Sample-efficient multi-agent reinforcement learning with masked reconstruction
    Kim, Jung In
    Lee, Young Jae
    Heo, Jongkook
    Park, Jinhyeok
    Kim, Jaehoon
    Lim, Sae Rin
    Jeong, Jinyong
    Kim, Seoung Bum
    PLOS ONE, 2023, 18 (09):
  • [30] Sample-Efficient Cardinality Estimation Using Geometric Deep Learning
    Reiner, Silvan
    Grossniklaus, Michael
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2023, 17 (04): : 740 - 752