The Improved Algorithm of Deep Q-learning Network Based on Eligibility Trace

Citations: 0
Authors
Liu, Bingyan [1 ,2 ]
Ye, Xiongbing [1 ]
Zhou, Chifei [1 ]
Liu, Yijing [1 ]
Zhang, Qiyang [2 ]
Dong, Fang [2 ]
Affiliations
[1] Acad Mil Sci, Beijing 100091, Peoples R China
[2] 32032 Troops, Beijing 100094, Peoples R China
Source
2020 6TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND ROBOTICS (ICCAR) | 2020
Keywords
reinforcement learning; deep Q-learning network; eligibility trace; automatic control;
DOI
10.1109/iccar49639.2020.9108040
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812 ;
Abstract
At present, the Deep Q-learning Network (DQN) has become an important research direction in reinforcement learning. In practical applications, however, DQN tends to overestimate action values under certain conditions and incurs a high cost. This paper proposes a new improved algorithm. The improved algorithm uses the behavioral eligibility trace of each state for experience replay, so that the samples that most need to be learned are found more effectively. The eligibility trace is also taken into account in the max-value calculation, which effectively alleviates the overestimation problem. During optimizer training, trace decay is considered, which improves the learning effect and accelerates the convergence of the algorithm. Simulation results for different algorithms applied to the inverted pendulum system show that the new algorithm converges better and yields lower cost and less overestimation than the Natural Deep Q-learning Network. These results indicate that the new algorithm plays a positive role in deep reinforcement learning and shows considerable promise.
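The record does not give the paper's exact update rule, but the abstract's core idea, combining Q-learning's max-based update with a decaying eligibility trace per state-action pair, can be illustrated with classic tabular Watkins's Q(λ). The sketch below is an assumption-laden toy example (a small deterministic chain environment, made-up hyperparameters, accumulating traces cut after exploratory actions), not the authors' DQN variant:

```python
import random

def q_lambda(n_states=5, n_actions=2, episodes=300, alpha=0.5,
             gamma=0.9, lam=0.8, epsilon=0.1, seed=0):
    """Tabular Watkins's Q(lambda) on a toy chain MDP.

    States 0..n_states-1; action 1 moves right, action 0 moves left.
    Reward +1 is given only on reaching the terminal state n_states-1.
    """
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        # Eligibility traces are reset at the start of each episode.
        E = [[0.0] * n_actions for _ in range(n_states)]
        s = 0
        for _ in range(1000):  # step cap as a safety bound
            if s == n_states - 1:
                break
            # epsilon-greedy action selection
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda x: Q[s][x])
            greedy = Q[s][a] == max(Q[s])  # was the chosen action greedy?
            s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            r = 1.0 if s2 == n_states - 1 else 0.0
            # TD error uses the max over next-state actions (Q-learning target).
            delta = r + gamma * max(Q[s2]) - Q[s][a]
            E[s][a] += 1.0  # accumulating trace for the visited pair
            # Decay all traces; Watkins's variant cuts them to zero after an
            # exploratory (non-greedy) action, since later credit no longer
            # corresponds to the greedy policy being evaluated.
            decay = gamma * lam if greedy else 0.0
            for i in range(n_states):
                for j in range(n_actions):
                    Q[i][j] += alpha * delta * E[i][j]
                    E[i][j] *= decay
            s = s2
    return Q

Q = q_lambda()
```

After training, the greedy policy moves right from every non-terminal state, and Q(3, right) approaches the immediate reward of 1. The trace decay factor γλ plays the "trace fading" role the abstract mentions: credit from a rewarding transition propagates backward along the visited trajectory, shrinking geometrically with distance, which typically speeds convergence over one-step Q-learning.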
Pages: 230 - 235
Page count: 6