Z-Score Experience Replay in Off-Policy Deep Reinforcement Learning

Cited by: 0
Authors
Yang, Yana [1]
Xi, Meng [1]
Dai, Huiao [1]
Wen, Jiabao [1]
Yang, Jiachen [1]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
deep reinforcement learning; off policy; priority experience replay; z-score
DOI
10.3390/s24237746
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
Reinforcement learning is a machine learning method that does not require pre-collected training data; instead, it seeks the optimal policy through continuous interaction between an agent and its environment, which makes it an important approach to sequential decision-making problems. Combined with deep learning, deep reinforcement learning offers powerful perception and decision-making capabilities and has been widely applied across domains to tackle complex decision problems. Off-policy reinforcement learning separates exploration from exploitation by storing and replaying interaction experiences, which makes it easier to find globally optimal solutions. How these experiences are utilized is therefore crucial to the efficiency of off-policy reinforcement learning algorithms. To address this problem, this paper proposes Z-Score Prioritized Experience Replay, which enhances the utilization of experiences and improves both the performance and the convergence speed of the algorithm. A series of ablation experiments demonstrates that the proposed method significantly improves the effectiveness of deep reinforcement learning algorithms.
Pages: 17