Improving Reinforcement Learning Exploration by Autoencoders

Cited: 0
Authors
Paczolay, Gabor [1 ]
Harmati, Istvan [1 ]
Affiliations
[1] Department of Control Engineering, Budapest University of Technology and Economics, Magyar Tudósok körútja 2., Budapest
Source
Periodica Polytechnica Electrical Engineering and Computer Science | 2024, Vol. 68, No. 4
Keywords
AutE-DQN; autoencoders; DQN; exploration; reinforcement learning
DOI
10.3311/PPee.36789
Abstract
Reinforcement learning has massive potential for solving engineering problems without domain knowledge. However, the problem of balancing exploration against exploitation emerges when one tries to keep a system between the learning phase and proper execution. In this paper, a new method is proposed that utilizes autoencoders to manage the exploration rate in an epsilon-greedy exploration algorithm: the reconstruction error between the real state and the state reconstructed by the autoencoder becomes the basis of the exploration-exploitation rate. The proposed method is then examined in two experiments: one benchmark is the cartpole task, while the other is a gridworld example created for this paper to examine long-term exploration. Both experiments show that the proposed method performs better in these scenarios. © 2024 Budapest University of Technology and Economics. All rights reserved.
Pages: 335-343
Page count: 8
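
The abstract describes a mechanism concrete enough to sketch: an autoencoder is trained on visited states, and its reconstruction error for the current state (high for novel states, low for familiar ones) drives the epsilon in epsilon-greedy action selection. This record contains no equations, so the sketch below is a minimal illustration of that idea under stated assumptions, not the paper's AutE-DQN implementation: the StateAutoencoder architecture, the clipped linear error-to-epsilon mapping in epsilon_from_error, and all hyperparameter values are hypothetical.

import numpy as np
import torch
import torch.nn as nn

class StateAutoencoder(nn.Module):
    # Small fully connected autoencoder over flat state vectors
    # (e.g., the 4-dimensional cartpole observation). Sizes are assumptions.
    def __init__(self, state_dim: int, latent_dim: int = 2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(state_dim, 16), nn.ReLU(),
            nn.Linear(16, latent_dim))
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 16), nn.ReLU(),
            nn.Linear(16, state_dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

def reconstruction_error(ae: StateAutoencoder, state: np.ndarray) -> float:
    # Mean squared error between the real state and its reconstruction.
    with torch.no_grad():
        x = torch.as_tensor(state, dtype=torch.float32).unsqueeze(0)
        return torch.mean((ae(x) - x) ** 2).item()

def epsilon_from_error(error: float, eps_min: float = 0.05,
                       eps_max: float = 1.0, scale: float = 1.0) -> float:
    # Assumed mapping: clip a linearly scaled reconstruction error into
    # [eps_min, eps_max], so novel (poorly reconstructed) states explore more.
    return float(np.clip(error / scale, eps_min, eps_max))

def select_action(q_values: np.ndarray, epsilon: float,
                  rng: np.random.Generator) -> int:
    # Standard epsilon-greedy over the DQN's Q-values for the current state.
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

def autoencoder_step(ae: StateAutoencoder, optimizer, states) -> float:
    # One gradient step fitting the autoencoder to recently visited states,
    # so familiar regions of the state space shrink epsilon over time.
    x = torch.as_tensor(np.asarray(states), dtype=torch.float32)
    loss = torch.mean((ae(x) - x) ** 2)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

In a DQN training loop one would call reconstruction_error on the current state, feed the result through epsilon_from_error, select the action with select_action (using, e.g., rng = np.random.default_rng()), and periodically run autoencoder_step on a batch of replay-buffer states alongside the usual Q-network update.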
Related papers
50 records in total
  • [21] Exploration in deep reinforcement learning: A survey
    Ladosz, Pawel
    Weng, Lilian
    Kim, Minwoo
    Oh, Hyondong
    INFORMATION FUSION, 2022, 85 : 1 - 22
  • [22] Uncertainty Quantification and Exploration for Reinforcement Learning
    Zhu, Yi
    Dong, Jing
    Lam, Henry
    OPERATIONS RESEARCH, 2024, 72 (04) : 1689 - 1709
  • [23] EFFICIENT AND STABLE INFORMATION DIRECTED EXPLORATION FOR CONTINUOUS REINFORCEMENT LEARNING
    Chen, Mingzhe
    Xiao, Xi
    Zhang, Wanpeng
    Gao, Xiaotian
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 4023 - 4027
  • [24] NAEM: Noisy Attention Exploration Module for Deep Reinforcement Learning
    Cai, Zhenwen
    Lee, Feifei
    Hu, Chunyan
    Kotani, Koji
    Chen, Qiu
    IEEE ACCESS, 2021, 9 : 154600 - 154611
  • [25] Efficient exploration in reinforcement learning based on utile suffix memory
    Pchelkin, A
    INFORMATICA, 2003, 14 (02) : 237 - 250
  • [26] A Meta-MDP Approach to Exploration for Lifelong Reinforcement Learning
    Garcia, Francisco M.
    Thomas, Philip S.
    AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 1976 - 1978
  • [27] Exploration in Relational Domains for Model-based Reinforcement Learning
    Lang, Tobias
    Toussaint, Marc
    Kersting, Kristian
    JOURNAL OF MACHINE LEARNING RESEARCH, 2012, 13 : 3725 - 3768
  • [28] Chaotic exploration effects on reinforcement learning in shortcut maze task
    Morihiro, Koichiro
    Matsui, Nobuyuki
    Nishimura, Haruhiko
    INTERNATIONAL JOURNAL OF BIFURCATION AND CHAOS, 2006, 16 (10): 3015 - 3022
  • [29] Reinforcement Learning Exploration Algorithms for Energy Harvesting Communications Systems
    Masadeh, Ala'eddin
    Wang, Zhengdao
    Kamal, Ahmed E.
    2018 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), 2018
  • [30] Learning of deterministic exploration and temporal abstraction in reinforcement learning
    Shibata, Katsunari
    2006 SICE-ICASE International Joint Conference, Vols 1-13, 2006, : 2212 - 2217