The Improved Algorithm of Deep Q-learning Network Based on Eligibility Trace

Citations: 0
Authors
Liu, Bingyan [1 ,2 ]
Ye, Xiongbing [1 ]
Zhou, Chifei [1 ]
Liu, Yijing [1 ]
Zhang, Qiyang [2 ]
Dong, Fang [2 ]
Affiliations
[1] Acad Mil Sci, Beijing 100091, Peoples R China
[2] 32032 Troops, Beijing 100094, Peoples R China
Source
2020 6TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND ROBOTICS (ICCAR) | 2020
Keywords
reinforcement learning; deep Q-learning network; eligibility trace; automatic control;
DOI
10.1109/iccar49639.2020.9108040
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812 ;
Abstract
At present, the Deep Q-learning Network (DQN) has become an important research direction in reinforcement learning. In practical applications, however, DQN tends to overestimate action values under certain conditions and incurs a high cost. This paper proposes a new improved algorithm. The improved algorithm uses the behavioral eligibility trace of each state for experience replay, so that the samples that most need to be learned are found more effectively. The eligibility trace is also taken into account in the max-value calculation, which effectively alleviates the overestimation problem. During optimizer training, trace decay is considered, which improves the learning effect and accelerates the convergence of the algorithm. Simulation results for different algorithms applied to the inverted pendulum system show that the new algorithm converges better and yields lower cost and less overestimation than the Natural Deep Q-learning Network. These results indicate that the new algorithm plays a positive role in deep reinforcement learning and shows considerable promise.
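The record does not give the paper's exact update rule, but the abstract's core idea, combining Q-learning's max-based update with a decaying eligibility trace per state-action pair, can be illustrated with classic tabular Watkins's Q(λ). The sketch below is an assumption-laden toy example (a small deterministic chain environment, made-up hyperparameters, accumulating traces cut after exploratory actions), not the authors' DQN variant:

```python
import random

def q_lambda(n_states=5, n_actions=2, episodes=300, alpha=0.5,
             gamma=0.9, lam=0.8, epsilon=0.1, seed=0):
    """Tabular Watkins's Q(lambda) on a toy chain MDP.

    States 0..n_states-1; action 1 moves right, action 0 moves left.
    Reward +1 is given only on reaching the terminal state n_states-1.
    """
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        # Eligibility traces are reset at the start of each episode.
        E = [[0.0] * n_actions for _ in range(n_states)]
        s = 0
        for _ in range(1000):  # step cap as a safety bound
            if s == n_states - 1:
                break
            # epsilon-greedy action selection
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda x: Q[s][x])
            greedy = Q[s][a] == max(Q[s])  # was the chosen action greedy?
            s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            r = 1.0 if s2 == n_states - 1 else 0.0
            # TD error uses the max over next-state actions (Q-learning target).
            delta = r + gamma * max(Q[s2]) - Q[s][a]
            E[s][a] += 1.0  # accumulating trace for the visited pair
            # Decay all traces; Watkins's variant cuts them to zero after an
            # exploratory (non-greedy) action, since later credit no longer
            # corresponds to the greedy policy being evaluated.
            decay = gamma * lam if greedy else 0.0
            for i in range(n_states):
                for j in range(n_actions):
                    Q[i][j] += alpha * delta * E[i][j]
                    E[i][j] *= decay
            s = s2
    return Q

Q = q_lambda()
```

After training, the greedy policy moves right from every non-terminal state, and Q(3, right) approaches the immediate reward of 1. The trace decay factor γλ plays the "trace fading" role the abstract mentions: credit from a rewarding transition propagates backward along the visited trajectory, shrinking geometrically with distance, which typically speeds convergence over one-step Q-learning.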
Pages: 230 - 235
Page count: 6