Sample-Efficient Deep Reinforcement Learning with Directed Associative Graph

Times cited: 0
Authors
Yang, Dujia [1 ,2 ]
Qin, Xiaowei [1 ,2 ]
Xu, Xiaodong [1 ,2 ]
Li, Chensheng [1 ,2 ]
Wei, Guo [1 ,2 ]
Affiliations
[1] Univ Sci & Technol China, Hefei 230026, Peoples R China
[2] CAS Key Lab Wireless Opt Commun, Hefei 230027, Peoples R China
Keywords
directed associative graph; sample efficiency; deep reinforcement learning
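DOI
Not available
Chinese Library Classification number
TN [Electronic technology, communication technology]
Discipline classification code
0809
Abstract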
Reinforcement learning can be modeled mathematically as a Markov decision process. Consequently, the interaction samples and the connections between them are the two main types of information available for learning. However, most recent work on deep reinforcement learning treats samples independently, both within an episode and across episodes. In this paper, to make use of more sample information, we propose an additional learning system based on a directed associative graph (DAG). The DAG is built from all trajectories in real time and captures the complete connection structure of all samples across all episodes. By planning along the directed edges of the DAG, we offer another way to estimate state-action pairs, especially those unknown to the deep neural network (DNN) and the episodic memory (EM). A mixed loss function is generated from the three learning systems (DNN, EM, and DAG) to improve the efficiency of parameter updates in the proposed algorithm. We show that our algorithm significantly outperforms the state-of-the-art algorithm in both performance and sample efficiency on the test environments. Furthermore, the convergence of our algorithm is proved in the appendix, and its long-term performance and the effects of the DAG are verified.
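The idea of linking samples across episodes can be illustrated with a small sketch. This is not the authors' implementation: the class name `TrajectoryGraph`, its methods, the tabular value backup, and the assumption of deterministic transitions with hashable states are all hypothetical choices made for illustration. It stores every observed transition as a directed edge and then backs up values along those edges, so a state-action pair seen in one episode can borrow a better continuation discovered in another.

```python
from collections import defaultdict


class TrajectoryGraph:
    """Hypothetical graph: nodes are states, directed edges are observed transitions."""

    def __init__(self, gamma=0.9):
        self.gamma = gamma
        # edges[state][action] = (reward, next_state, done)
        # Assumption: transitions are deterministic, so one edge per (state, action).
        self.edges = defaultdict(dict)

    def add_trajectory(self, trajectory):
        """Insert one episode, given as (state, action, reward, next_state, done) tuples."""
        for state, action, reward, next_state, done in trajectory:
            self.edges[state][action] = (reward, next_state, done)

    def plan(self, sweeps=50):
        """Back up values along stored edges and return a dict of Q estimates."""
        q = {}
        for _ in range(sweeps):
            for state, actions in self.edges.items():
                for action, (reward, next_state, done) in actions.items():
                    # Values of every action already recorded at the successor state.
                    next_q = [
                        q.get((next_state, a), 0.0)
                        for a in self.edges.get(next_state, {})
                    ]
                    bootstrap = 0.0 if done or not next_q else max(next_q)
                    q[(state, action)] = reward + self.gamma * bootstrap
        return q


graph = TrajectoryGraph(gamma=0.9)
# Episode 1 reaches the rewarding terminal state G through B.
graph.add_trajectory([("A", "a0", 0.0, "B", False), ("B", "a1", 1.0, "G", True)])
# Episode 2 also passes through B but ends without reward.
graph.add_trajectory([("D", "a0", 0.0, "B", False), ("B", "a0", 0.0, "C", True)])
q = graph.plan()
```

In this toy example, episode 2 on its own yields a return of zero from state `D`, yet the graph assigns `q[("D", "a0")]` the value 0.9, because `D` connects to `B` and episode 1 recorded a rewarding action there.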
Pages: 100-113
Number of pages: 14
Related papers
50 in total
  • [31] Sample-efficient deep learning for accelerating photonic inverse design
    Hegde, Ravi
    OSA CONTINUUM, 2021, 4 (03): : 1019 - 1033
  • [32] Sample-efficient Reinforcement Learning Representation Learning with Curiosity Contrastive Forward Dynamics Model
    Nguyen, Thanh
    Luu, Tung M.
    Vu, Thang
    Yoo, Chang D.
    2021 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2021, : 3471 - 3477
  • [33] Sample-Efficient Preference-based Reinforcement Learning with Dynamics Aware Rewards
    Metcalf, Katherine
    Sarabia, Miguel
    Mackraz, Natalie
    Theobald, Barry-John
    CONFERENCE ON ROBOT LEARNING, VOL 229, 2023, 229
  • [34] Sample-Efficient Reinforcement Learning for Linearly-Parameterized MDPs with a Generative Model
    Wang, Bingyan
    Yan, Yuling
    Fan, Jianqing
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [35] On Sample-Efficient Offline Reinforcement Learning: Data Diversity, Posterior Sampling, and Beyond
    Nguyen-Tang, Thanh
    Arora, Raman
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [36] Sample-Efficient Multimodal Dynamics Modeling for Risk-Sensitive Reinforcement Learning
    Yashima, Ryota
    Yamaguchi, Akihiko
    Hashimoto, Koichi
    2022 8TH INTERNATIONAL CONFERENCE ON MECHATRONICS AND ROBOTICS ENGINEERING (ICMRE 2022), 2022, : 21 - 27
  • [37] Sample-Efficient Reinforcement Learning Is Feasible for Linearly Realizable MDPs with Limited Revisiting
    Li, Gen
    Chen, Yuxin
    Chi, Yuejie
    Gu, Yuantao
    Wei, Yuting
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [38] Sample-Efficient Multimodal Dynamics Modeling for Risk-Sensitive Reinforcement Learning
    Yashima, Ryota
    Yamaguchi, Akihiko
    Hashimoto, Koichi
    2022 8th International Conference on Mechatronics and Robotics Engineering, ICMRE 2022, 2022, : 21 - 27
  • [39] Sample-Efficient Multi-Agent Reinforcement Learning with Demonstrations for Flocking Control
    Qiu, Yunbo
    Zhan, Yuzhu
    Jin, Yue
    Wang, Jian
    Zhang, Xudong
    2022 IEEE 96TH VEHICULAR TECHNOLOGY CONFERENCE (VTC2022-FALL), 2022,
  • [40] Ship course-keeping in waves using sample-efficient reinforcement learning
    Greep, Justin
    Bayezit, Afsin Baran
    Mak, Bart
    Rijpkema, Douwe
    Kinaci, Omer Kemal
    Duz, Bulent
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2025, 141