Deep Reinforcement Learning with Sarsa and Q-Learning: A Hybrid Approach

Cited by: 17

Authors:

Xu, Zhi-xiong [1]

Cao, Lei [1]

Chen, Xi-liang [1]

Li, Chen-xi [1]

Zhang, Yong-liang [1]

Lai, Jun [1]

Affiliations:

[1] PLA Univ Sci & Technol, Inst Command Informat Syst, Nanjing 100190, Jiangsu, Peoples R China

Source:

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS | 2018 / Vol. E101D / No. 09

Keywords:

deep reinforcement learning; Deep Q Network; overestimation; double estimator; Sarsa;

DOI:

10.1587/transinf.2017EDP7278

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Subject classification code:

0812 ;

Abstract:

The commonly used Deep Q Networks algorithm is known to overestimate action values under certain conditions. It has also been shown that overestimations harm performance and can cause instability and divergence in learning. In this paper, we present the Deep Sarsa and Q Networks (DSQN) algorithm, which can be considered an enhancement of the Deep Q Networks algorithm. First, the DSQN algorithm takes advantage of the experience replay and target network techniques of Deep Q Networks to improve the stability of neural networks. Second, a double estimator is used for Q-learning to reduce overestimations. In particular, we introduce Sarsa learning into Deep Q Networks to further remove overestimations. Finally, the DSQN algorithm is evaluated on the cart-pole balancing, mountain car, and lunar lander control tasks from the OpenAI Gym. The empirical results show that the proposed method leads to reduced overestimations, a more stable learning process, and improved performance.
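
To make the three ingredients named in the abstract concrete, the sketch below computes the standard one-step bootstrap targets for Q-learning (as in Deep Q Networks), the double estimator, and Sarsa. It is only an illustration of the textbook targets: the abstract does not say how DSQN combines them, so no combination rule is shown. The function names, the `gamma` default, and the plain-list representation of next-state action values are assumptions made for this example.

```python
def q_learning_target(reward, next_q_target, gamma=0.99, done=False):
    """Q-learning / DQN target: max over the target network's next-state values."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_q_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double estimator: the online values pick the action, the target values score it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


def sarsa_target(reward, next_q_target, next_action, gamma=0.99, done=False):
    """Sarsa target: bootstrap from the action actually taken in the next state."""
    if done:
        return reward
    return reward + gamma * next_q_target[next_action]


# Hypothetical next-state action values from two estimators (illustrative numbers only).
next_q_online = [1.0, 2.0, 0.5]
next_q_target = [1.5, 1.0, 3.0]

print(q_learning_target(1.0, next_q_target, gamma=0.5))               # 2.5
print(double_q_target(1.0, next_q_online, next_q_target, gamma=0.5))  # 1.5
print(sarsa_target(1.0, next_q_target, next_action=0, gamma=0.5))     # 1.75
```

In this toy case the max-based target (2.5) is the largest of the three, which is consistent with the overestimation concern the abstract raises: taking a maximum over noisy estimates tends to bias the target upward, while the double-estimator and Sarsa targets avoid evaluating the same maximizing estimate.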


Pages: 2315 - 2322

Page count: 8

50 items in total

[1] Deep Reinforcement Learning: From Q-Learning to Deep Q-Learning
Tan, Fuxiao
Yan, Pengfei
Guan, Xinping
NEURAL INFORMATION PROCESSING (ICONIP 2017), PT IV, 2017, 10637 : 475 - 483
[2] Comparison of Deep Q-Learning, Q-Learning and SARSA Reinforced Learning for Robot Local Navigation
Anas, Hafiq
Ong, Wee Hong
Malik, Owais Ahmed
ROBOT INTELLIGENCE TECHNOLOGY AND APPLICATIONS 6, 2022, 429 : 443 - 454
[3] Backward Q-learning: The combination of Sarsa algorithm and Q-learning
Wang, Yin-Hao
Li, Tzuu-Hseng S.
Lin, Chih-Jui
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2013, 26 (09) : 2184 - 2193
[4] Deep Reinforcement Learning with Double Q-Learning
van Hasselt, Hado
Guez, Arthur
Silver, David
THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016 : 2094 - 2100
[5] Deep Q-Learning Based Reinforcement Learning Approach for Network Intrusion Detection
Alavizadeh, Hooman
Alavizadeh, Hootan
Jang-Jaccard, Julian
COMPUTERS, 2022, 11 (03)
[6] Q-learning based Reinforcement Learning Approach for Lane Keeping
Feher, Arpad
Aradi, Szilard
Becsi, Tamas
2018 18TH IEEE INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND INFORMATICS (CINTI), 2018 : 31 - 35
[7] Enhanced Machine Learning Algorithms: Deep Learning, Reinforcement Learning, and Q-Learning
Park, Ji Su
Park, Jong Hyuk
JOURNAL OF INFORMATION PROCESSING SYSTEMS, 2020, 16 (05) : 1001 - 1007
[8] Energy management based on reinforcement learning with double deep Q-learning for a hybrid electric tracked vehicle
Han, Xuefeng
He, Hongwen
Wu, Jingda
Peng, Jiankun
Li, Yuecheng
APPLIED ENERGY, 2019, 254
[9] Comparative analysis of Q-learning, SARSA, and deep Q-network for microgrid energy management
Ramesh, Sreyas
Sukanth, B. N.
Sathyavarapu, Sri Jaswanth
Sharma, Vishwash
Kumaar, A. A. Nippun
Khanna, Manju
SCIENTIFIC REPORTS, 2025, 15 (01)
[10] Analysis of the influence of the rate of learning and the factor of discount on the performance of Q-learning and SARSA algorithms: application of learning by reinforcement in autonomous navigation
Carvalho Ottoni, Andre Luiz
Nepomuceno, Erivelton Geraldo
de Oliveira, Marcos Santos
Cordeiro, Lara Toledo
Lamperti, Rubisson Duarte
REVISTA BRASILEIRA DE COMPUTACAO APLICADA, 2016, 8 (02) : 44 - 59
