Improving Reinforcement Learning Exploration by Autoencoders

Cited: 0
Authors
Paczolay, Gabor [1 ]
Harmati, Istvan [1 ]
Affiliations
[1] Department of Control Engineering, Budapest University of Technology and Economics, Magyar Tudósok körútja 2., Budapest
Source
Periodica Polytechnica Electrical Engineering and Computer Science, 2024, Vol. 68, No. 4
Keywords
AutE-DQN; autoencoders; DQN; exploration; reinforcement learning;
DOI: 10.3311/PPee.36789
Abstract
Reinforcement learning has massive potential for solving engineering problems without domain knowledge. However, the exploration-exploitation problem emerges when one tries to balance a system between the learning phase and proper execution. In this paper, a new method is proposed that utilizes autoencoders to manage the exploration rate in an epsilon-greedy exploration algorithm: the error between the real state and the state reconstructed by the autoencoder becomes the basis of the exploration-exploitation rate. The proposed method is then examined in two experiments: one benchmark is the cartpole experiment, while the other is a gridworld example created for this paper to examine long-term exploration. Both experiments show that the proposed method performs better in these scenarios. © 2024 Budapest University of Technology and Economics. All rights reserved.
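As a concrete illustration of the idea described in the abstract, the sketch below shows one way an autoencoder's reconstruction error could drive the epsilon of an epsilon-greedy policy. This is a minimal sketch under stated assumptions, not the paper's AutE-DQN implementation: the class names StateAutoencoder and AutoencoderEpsilon, the network sizes, and the scaling constant k are all choices made here for illustration.

import numpy as np
import torch
import torch.nn as nn

class StateAutoencoder(nn.Module):
    # Small autoencoder; the reconstruction error of a state acts as a novelty signal.
    def __init__(self, state_dim, latent_dim=4):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(state_dim, 32), nn.ReLU(),
                                     nn.Linear(32, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(),
                                     nn.Linear(32, state_dim))

    def forward(self, x):
        return self.decoder(self.encoder(x))

class AutoencoderEpsilon:
    # Maps reconstruction error to an epsilon in [eps_min, eps_max].
    # The scaling constant k is a hypothetical choice, not taken from the paper.
    def __init__(self, state_dim, eps_min=0.05, eps_max=1.0, k=10.0, lr=1e-3):
        self.ae = StateAutoencoder(state_dim)
        self.opt = torch.optim.Adam(self.ae.parameters(), lr=lr)
        self.loss_fn = nn.MSELoss()
        self.eps_min, self.eps_max, self.k = eps_min, eps_max, k

    def epsilon(self, state):
        # Novel state -> high reconstruction error -> epsilon near eps_max (explore).
        x = torch.as_tensor(state, dtype=torch.float32)
        with torch.no_grad():
            err = self.loss_fn(self.ae(x), x).item()
        return float(np.clip(self.k * err, self.eps_min, self.eps_max))

    def update(self, state):
        # Train the autoencoder on visited states so familiar regions
        # reconstruct well and their epsilon decays over time.
        x = torch.as_tensor(state, dtype=torch.float32)
        loss = self.loss_fn(self.ae(x), x)
        self.opt.zero_grad()
        loss.backward()
        self.opt.step()

sched = AutoencoderEpsilon(state_dim=4)   # 4 matches a cartpole-style observation
s = np.random.randn(4).astype(np.float32)
print(sched.epsilon(s))                   # high while the state is still novel
for _ in range(200):
    sched.update(s)
print(sched.epsilon(s))                   # lower once the state is familiar

In a DQN loop, each step would query epsilon(state) before choosing between a random and a greedy action, and call update(state) afterwards: frequently visited states reconstruct well and exploration decays there, while novel states keep epsilon high.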
Pages: 335-343
Page count: 8
Related Papers
(50 records in total)
  • [41] Yuan, M., Pun, M.-O., Wang, D.: Rényi State Entropy Maximization for Exploration Acceleration in Reinforcement Learning. IEEE Transactions on Artificial Intelligence, 2023, 4(5), 1154-1164.
  • [42] Bougie, N., Ichise, R.: Fast and slow curiosity for high-level exploration in reinforcement learning. Applied Intelligence, 2021, 51(2), 1086-1107.
  • [43] Yan, X., Huang, J., He, K., Hong, H., Xu, D.: Autonomous exploration through deep reinforcement learning. Industrial Robot: The International Journal of Robotics Research and Application, 2023, 50(5), 793-803.
  • [44] Vamplew, P., Dazeley, R., Foale, C.: Softmax exploration strategies for multiobjective reinforcement learning. Neurocomputing, 2017, 263, 74-86.
  • [45] Hajar, M. S., Kalutarage, H., Al-Kadri, M. O.: A Robust Exploration Strategy in Reinforcement Learning Based on Temporal Difference Error. AI 2022: Advances in Artificial Intelligence, 2022, 13728, 789-799.
  • [46] Chen, Q., Zhang, Q., Liu, Y.: Balancing exploration and exploitation in episodic reinforcement learning. Expert Systems with Applications, 2023, 231.
  • [47] Tammewar, A., Chaudhari, N., Saini, B., Venkatesh, D., Dharahas, G., Vora, D., Patil, S., Kotecha, K., Alfarhood, S.: Improving the Performance of Autonomous Driving through Deep Reinforcement Learning. Sustainability, 2023, 15(18).
  • [48] Ozcan, O., de Moraes, C. C., Alt, J.: Balancing Exploration and Exploitation Ratio in Reinforcement Learning. Military Modeling & Simulation Symposium 2011 (MMS 2011), 2011 Spring Simulation Multiconference, Book 7 of 8, 2011, 126-131.
  • [49] Karimpanal, T. G., Rana, S., Gupta, S., Truyen Tran, Venkatesh, S.: Learning Transferable Domain Priors for Safe Exploration in Reinforcement Learning. 2020 International Joint Conference on Neural Networks (IJCNN), 2020.
  • [50] Kyoung, D., Sung, Y.: Transformer Decoder-Based Enhanced Exploration Method to Alleviate Initial Exploration Problems in Reinforcement Learning. Sensors, 2023, 23(17).