Improving Reinforcement Learning Exploration by Autoencoders

Cited: 0
Authors
Paczolay, Gabor [1 ]
Harmati, Istvan [1 ]
Affiliations
[1] Department of Control Engineering, Budapest University of Technology and Economics, Magyar Tudósok körútja 2., Budapest
Source
Periodica Polytechnica Electrical Engineering and Computer Science, 2024, Vol. 68, No. 4
Keywords
AutE-DQN; autoencoders; DQN; exploration; reinforcement learning;
DOI: 10.3311/PPee.36789
Abstract
Reinforcement learning has massive potential for solving engineering problems without domain knowledge. However, the exploration-exploitation problem emerges when one tries to balance a system between the learning phase and proper execution. In this paper, a new method is proposed that utilizes autoencoders to manage the exploration rate in an epsilon-greedy exploration algorithm: the error between the real state and the state reconstructed by the autoencoder becomes the basis of the exploration-exploitation rate. The proposed method is then examined in two experiments: one benchmark is the cartpole experiment, while the other is a gridworld example created for this paper to examine long-term exploration. Both experiments show that the proposed method performs better in these scenarios. © 2024 Budapest University of Technology and Economics. All rights reserved.
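As a concrete illustration of the idea described in the abstract, the sketch below shows one way an autoencoder's reconstruction error could drive the epsilon of an epsilon-greedy policy. This is a minimal sketch under stated assumptions, not the paper's AutE-DQN implementation: the class names StateAutoencoder and AutoencoderEpsilon, the network sizes, and the scaling constant k are all choices made here for illustration.

import numpy as np
import torch
import torch.nn as nn

class StateAutoencoder(nn.Module):
    # Small autoencoder; the reconstruction error of a state acts as a novelty signal.
    def __init__(self, state_dim, latent_dim=4):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(state_dim, 32), nn.ReLU(),
                                     nn.Linear(32, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(),
                                     nn.Linear(32, state_dim))

    def forward(self, x):
        return self.decoder(self.encoder(x))

class AutoencoderEpsilon:
    # Maps reconstruction error to an epsilon in [eps_min, eps_max].
    # The scaling constant k is a hypothetical choice, not taken from the paper.
    def __init__(self, state_dim, eps_min=0.05, eps_max=1.0, k=10.0, lr=1e-3):
        self.ae = StateAutoencoder(state_dim)
        self.opt = torch.optim.Adam(self.ae.parameters(), lr=lr)
        self.loss_fn = nn.MSELoss()
        self.eps_min, self.eps_max, self.k = eps_min, eps_max, k

    def epsilon(self, state):
        # Novel state -> high reconstruction error -> epsilon near eps_max (explore).
        x = torch.as_tensor(state, dtype=torch.float32)
        with torch.no_grad():
            err = self.loss_fn(self.ae(x), x).item()
        return float(np.clip(self.k * err, self.eps_min, self.eps_max))

    def update(self, state):
        # Train the autoencoder on visited states so familiar regions
        # reconstruct well and their epsilon decays over time.
        x = torch.as_tensor(state, dtype=torch.float32)
        loss = self.loss_fn(self.ae(x), x)
        self.opt.zero_grad()
        loss.backward()
        self.opt.step()

sched = AutoencoderEpsilon(state_dim=4)   # 4 matches a cartpole-style observation
s = np.random.randn(4).astype(np.float32)
print(sched.epsilon(s))                   # high while the state is still novel
for _ in range(200):
    sched.update(s)
print(sched.epsilon(s))                   # lower once the state is familiar

In a DQN loop, each step would query epsilon(state) before choosing between a random and a greedy action, and call update(state) afterwards: frequently visited states reconstruct well and exploration decays there, while novel states keep epsilon high.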
Pages: 335-343
Page count: 8
Related Papers
(50 records in total)
  • [41] Yuan, M., Pun, M.-O., Wang, D.: Rényi State Entropy Maximization for Exploration Acceleration in Reinforcement Learning. IEEE Transactions on Artificial Intelligence, 2023, 4(5), 1154-1164.
  • [42] Bougie, N., Ichise, R.: Fast and slow curiosity for high-level exploration in reinforcement learning. Applied Intelligence, 2021, 51(2), 1086-1107.
  • [43] Yan, X., Huang, J., He, K., Hong, H., Xu, D.: Autonomous exploration through deep reinforcement learning. Industrial Robot: The International Journal of Robotics Research and Application, 2023, 50(5), 793-803.
  • [44] Vamplew, P., Dazeley, R., Foale, C.: Softmax exploration strategies for multiobjective reinforcement learning. Neurocomputing, 2017, 263, 74-86.
  • [45] Hajar, M. S., Kalutarage, H., Al-Kadri, M. O.: A Robust Exploration Strategy in Reinforcement Learning Based on Temporal Difference Error. AI 2022: Advances in Artificial Intelligence, 2022, 13728, 789-799.
  • [46] Chen, Q., Zhang, Q., Liu, Y.: Balancing exploration and exploitation in episodic reinforcement learning. Expert Systems with Applications, 2023, 231.
  • [47] Tammewar, A., Chaudhari, N., Saini, B., Venkatesh, D., Dharahas, G., Vora, D., Patil, S., Kotecha, K., Alfarhood, S.: Improving the Performance of Autonomous Driving through Deep Reinforcement Learning. Sustainability, 2023, 15(18).
  • [48] Ozcan, O., de Moraes, C. C., Alt, J.: Balancing Exploration and Exploitation Ratio in Reinforcement Learning. Military Modeling & Simulation Symposium 2011 (MMS 2011), 2011 Spring Simulation Multiconference, Book 7 of 8, 2011, 126-131.
  • [49] Karimpanal, T. G., Rana, S., Gupta, S., Truyen Tran, Venkatesh, S.: Learning Transferable Domain Priors for Safe Exploration in Reinforcement Learning. 2020 International Joint Conference on Neural Networks (IJCNN), 2020.
  • [50] Kyoung, D., Sung, Y.: Transformer Decoder-Based Enhanced Exploration Method to Alleviate Initial Exploration Problems in Reinforcement Learning. Sensors, 2023, 23(17).