Z-Score Experience Replay in Off-Policy Deep Reinforcement Learning

Cited by: 0
Authors
Yang, Yana [1]
Xi, Meng [1]
Dai, Huiao [1]
Wen, Jiabao [1]
Yang, Jiachen [1]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
deep reinforcement learning; off policy; priority experience replay; z-score;
DOI
10.3390/s24237746
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry]
Subject classification codes
070302; 081704
Abstract
Reinforcement learning, as a machine learning method that does not require pre-training data, seeks the optimal policy through the continuous interaction between an agent and its environment. It is an important approach to solving sequential decision-making problems. By combining it with deep learning, deep reinforcement learning possesses powerful perception and decision-making capabilities and has been widely applied to various domains to tackle complex decision problems. Off-policy reinforcement learning separates exploration and exploitation by storing and replaying interaction experiences, making it easier to find global optimal solutions. Understanding how to utilize experiences is crucial for improving the efficiency of off-policy reinforcement learning algorithms. To address this problem, this paper proposes Z-Score Prioritized Experience Replay, which enhances the utilization of experiences and improves the performance and convergence speed of the algorithm. A series of ablation experiments demonstrate that the proposed method significantly improves the effectiveness of deep reinforcement learning algorithms.
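The abstract names Z-Score Prioritized Experience Replay but does not describe how it works. The sketch below is therefore only an illustration of one plausible reading, not the authors' algorithm: it standardizes absolute TD errors into z-scores, shifts them to be positive, and uses them as sampling weights for a replay buffer. The function names `zscore_probabilities` and `sample_batch`, the `eps` offset, and the use of the TD error as the priority signal are all assumptions introduced here.

```python
import random
import statistics


def zscore_probabilities(td_errors, eps=1e-6):
    """Turn TD errors into sampling probabilities via z-scores.

    Hypothetical helper: the shift-and-normalize step is an assumption,
    not something stated in the abstract.
    """
    abs_err = [abs(e) for e in td_errors]
    mean = statistics.fmean(abs_err)
    std = statistics.pstdev(abs_err)
    if std < eps:
        # All errors are (nearly) identical: fall back to uniform sampling.
        return [1.0 / len(abs_err)] * len(abs_err)
    # Standardize: how many standard deviations each error is from the mean.
    z = [(e - mean) / std for e in abs_err]
    # Shift so every weight is strictly positive, then normalize to sum to 1.
    z_min = min(z)
    shifted = [v - z_min + eps for v in z]
    total = sum(shifted)
    return [s / total for s in shifted]


def sample_batch(buffer, td_errors, batch_size, rng=random):
    """Draw a batch (with replacement) weighted by z-score probabilities."""
    probs = zscore_probabilities(td_errors)
    return rng.choices(buffer, weights=probs, k=batch_size)


if __name__ == "__main__":
    transitions = ["t0", "t1", "t2"]   # stand-ins for stored experiences
    errors = [0.1, -0.5, 2.0]          # illustrative TD errors
    print(zscore_probabilities(errors))
    print(sample_batch(transitions, errors, batch_size=4))
```

In this sketch the experience with the smallest absolute error keeps a small but nonzero probability (`eps`), so no stored transition is excluded outright. Whether the published method does the same is not stated in this record.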
Pages: 17