Sample-Efficient Deep Reinforcement Learning with Directed Associative Graph

Cited by: 0
|
Authors
Yang, Dujia [1 ,2 ]
Qin, Xiaowei [1 ,2 ]
Xu, Xiaodong [1 ,2 ]
Li, Chensheng [1 ,2 ]
Wei, Guo [1 ,2 ]
Affiliations
[1] Univ Sci & Technol China, Hefei 230026, Peoples R China
[2] CAS Key Lab Wireless Opt Commun, Hefei 230027, Peoples R China
Keywords
directed associative graph; sample efficiency; deep reinforcement learning;
DOI
Not available
CLC classification number
TN [Electronic technology, communication technology];
Discipline classification number
0809 ;
Abstract
Reinforcement learning can be modeled mathematically as a Markov decision process. Consequently, the interaction samples, together with the connection relations between them, are the two main types of information available for learning. However, most recent work on deep reinforcement learning treats samples independently, either within their own episode or across episodes. In this paper, to exploit more of this sample information, we propose an additional learning system based on a directed associative graph (DAG). The DAG is built over all trajectories in real time and captures the full connection relations among samples across all episodes. By planning along the directed edges of the DAG, we obtain another perspective for estimating state-action pairs, especially those unknown to the deep neural network (DNN) and the episodic memory (EM). A mixed loss function combining the three learning systems (DNN, EM, and DAG) improves the efficiency of parameter updates in the proposed algorithm. We show that our algorithm significantly outperforms the state-of-the-art algorithms in performance and sample efficiency on the test environments. Furthermore, the convergence of our algorithm is proved in the appendix, and its long-term performance as well as the effects of the DAG are verified.
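The abstract's core idea can be illustrated with a minimal sketch: store transitions from all episodes as a directed graph and back up returns along its edges to obtain an alternative value estimate for a state-action pair. This is a hypothetical simplification under assumed names (`DirectedAssociativeGraph`, `q_estimate`), not the paper's actual implementation; it assumes discrete states, deterministic transitions, and a single stored outcome per edge.

```python
from collections import defaultdict

class DirectedAssociativeGraph:
    """Sketch of a DAG-style learning system: transitions from all
    episodes are merged into one directed graph, and planning over
    the stored edges yields a graph-based value estimate."""

    def __init__(self, gamma=0.99):
        self.gamma = gamma
        # state -> {action: (reward, next_state)}, shared across episodes
        self.edges = defaultdict(dict)
        # state -> value estimate obtained by planning on the graph
        self.value = defaultdict(float)

    def add_transition(self, s, a, r, s_next):
        # Built "in real time": every observed transition becomes an edge.
        self.edges[s][a] = (r, s_next)

    def plan(self, sweeps=10):
        # Value-iteration-style backups restricted to stored edges.
        for _ in range(sweeps):
            for s, actions in self.edges.items():
                self.value[s] = max(r + self.gamma * self.value[s2]
                                    for r, s2 in actions.values())

    def q_estimate(self, s, a):
        # Graph-based estimate for a state-action pair; in the paper's
        # framing this would enter a mixed loss alongside the DNN and
        # episodic-memory estimates. Returns None for unseen pairs.
        if a not in self.edges[s]:
            return None
        r, s2 = self.edges[s][a]
        return r + self.gamma * self.value[s2]
```

For example, two transitions `s0 -> s1 -> s2` with reward 1 each and `gamma=0.5` give `q_estimate("s0", "a") == 1.5` after planning, while a pair never observed returns `None`.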
Pages: 100 - 113
Number of pages: 14
Related papers
50 records in total
  • [21] Relative Entropy Regularized Sample-Efficient Reinforcement Learning With Continuous Actions
    Shang, Zhiwei
    Li, Renxing
    Zheng, Chunhua
    Li, Huiyun
    Cui, Yunduan
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2025, 36 (01) : 475 - 485
  • [22] TEXPLORE: real-time sample-efficient reinforcement learning for robots
    Hester, Todd
    Stone, Peter
    MACHINE LEARNING, 2013, 90 (03) : 385 - 429
  • [23] Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning
    Xie, Tengyang
    Jiang, Nan
    Wang, Huan
    Xiong, Caiming
    Bai, Yu
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [24] Model-aided Deep Reinforcement Learning for Sample-efficient UAV Trajectory Design in IoT Networks
    Esrafilian, Omid
    Bayerlein, Harald
    Gesbert, David
    2021 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2021,
  • [25] Augmented Memory: Sample-Efficient Generative Molecular Design with Reinforcement Learning
    Guo, Jeff
    Schwaller, Philippe
    JACS AU, 2024, 4 (06): : 2160 - 2172
  • [27] Sample-efficient model-based reinforcement learning for quantum control
    Khalid, Irtaza
    Weidner, Carrie A.
    Jonckheere, Edmond A.
    Schirmer, Sophie G.
    Langbein, Frank C.
    PHYSICAL REVIEW RESEARCH, 2023, 5 (04):
  • [28] Sample-Efficient Blockage Prediction and Handover Using Causal Reinforcement Learning
    Kanagamani, Tamizharasan
    Sadasivan, Jishnu
    Banerjee, Serene
    10TH INTERNATIONAL CONFERENCE ON ELECTRONICS, COMPUTING AND COMMUNICATION TECHNOLOGIES, CONECCT 2024, 2024,
  • [29] Sample-efficient multi-agent reinforcement learning with masked reconstruction
    Kim, Jung In
    Lee, Young Jae
    Heo, Jongkook
    Park, Jinhyeok
    Kim, Jaehoon
    Lim, Sae Rin
    Jeong, Jinyong
    Kim, Seoung Bum
    PLOS ONE, 2023, 18 (09):
  • [30] Sample-Efficient Cardinality Estimation Using Geometric Deep Learning
    Reiner, Silvan
    Grossniklaus, Michael
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2023, 17 (04): : 740 - 752