Incorporating Explanations to Balance the Exploration and Exploitation of Deep Reinforcement Learning

Cited by: 1
Authors
Wang, Xinzhi [1]
Liu, Yang [1]
Chang, Yudong [1]
Jiang, Chao [1]
Zhang, Qingjie [1]
Affiliations
[1] Shanghai Univ, Sch Comp Engn & Sci, Shanghai, Peoples R China
Source
KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT II | 2022 / Vol. 13369
Keywords
Deep reinforcement learning; Explainable AI; Explanation fusion; Variational auto encoder; LEVEL;
DOI
10.1007/978-3-031-10986-7_16
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Discovering efficient exploration strategies is a central challenge in reinforcement learning (RL). Deep reinforcement learning (DRL) methods proposed in recent years have mainly focused on improving model generalization while neglecting model explainability. In this study, an embedding explanation for the advantage actor-critic algorithm (EEA2C) is proposed to balance exploration and exploitation in DRL models. Specifically, the proposed algorithm first explains the agent's actions and then uses these explanations to guide exploration. A fusion strategy is designed to retain the information from experience that is helpful for exploration. Based on the output of the fusion strategy, a variational autoencoder (VAE) encodes the task-related explanation into a probabilistic latent representation, which is finally incorporated into the agent's policy as prior knowledge. Experimental results on six Atari environments show that the proposed method improves the agent's exploratory capabilities with explainable knowledge.
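The abstract gives no architectural details, so the following is only a minimal sketch of the general idea it describes: an explanation vector is encoded into a Gaussian latent (as in a standard VAE), a sample of that latent is drawn with the reparameterization trick, and the sample is concatenated with the state features before the actor head. All names, layer shapes, and the concatenation step are assumptions made for illustration and are not taken from the EEA2C paper.

```python
import math
import random


def linear(weights, bias, x):
    """Dense layer; `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)), the usual VAE regularizer."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_with_explanation(state_feat, explanation, params, rng):
    """Hypothetical actor head conditioned on state features and latent z."""
    mu = linear(params["w_mu"], params["b_mu"], explanation)
    log_var = linear(params["w_lv"], params["b_lv"], explanation)
    z = sample_latent(mu, log_var, rng)
    # Assumption: the latent enters the policy by simple concatenation.
    logits = linear(params["w_pi"], params["b_pi"], state_feat + z)
    return softmax(logits), kl_to_standard_normal(mu, log_var)


def random_matrix(rows, cols, rng):
    """Small random weights, standing in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
            for _ in range(rows)]


# Hypothetical sizes chosen only to keep the example small.
STATE_DIM, EXPL_DIM, LATENT_DIM, N_ACTIONS = 4, 6, 2, 3

rng = random.Random(0)
params = {
    "w_mu": random_matrix(LATENT_DIM, EXPL_DIM, rng),
    "b_mu": [0.0] * LATENT_DIM,
    "w_lv": random_matrix(LATENT_DIM, EXPL_DIM, rng),
    "b_lv": [0.0] * LATENT_DIM,
    "w_pi": random_matrix(N_ACTIONS, STATE_DIM + LATENT_DIM, rng),
    "b_pi": [0.0] * N_ACTIONS,
}

state_feat = [rng.uniform(-1.0, 1.0) for _ in range(STATE_DIM)]
explanation = [rng.uniform(0.0, 1.0) for _ in range(EXPL_DIM)]

probs, kl = policy_with_explanation(state_feat, explanation, params, rng)
print("action probabilities:", probs)
print("KL term:", kl)
```

In a training loop, a KL term of this kind would typically be added to the actor-critic loss with a small weight; how EEA2C weights or schedules it is not stated in the abstract.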
Pages: 200 - 211
Page count: 12
Related papers
50 in total
  • [1] Exploration in deep reinforcement learning: A survey
    Ladosz, Pawel
    Weng, Lilian
    Kim, Minwoo
    Oh, Hyondong
    INFORMATION FUSION, 2022, 85 : 1 - 22
  • [2] Automating post-exploitation with deep reinforcement learning
    Maeda, Ryusei
    Mimura, Mamoru
    COMPUTERS & SECURITY, 2021, 100
  • [3] Counterfactual state explanations for reinforcement learning agents via generative deep learning
    Olson, Matthew L.
    Khanna, Roli
    Neal, Lawrence
    Li, Fuxin
    Wong, Weng-Keen
    ARTIFICIAL INTELLIGENCE, 2021, 295
  • [4] Two-Stage Evolutionary Reinforcement Learning for Enhancing Exploration and Exploitation
    Zhu, Qingling
    Wu, Xiaoqiang
    Lin, Qiuzhen
    Chen, Wei-Neng
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 18, 2024, : 20892 - 20900
  • [5] Deep Reinforcement Learning with Noisy Exploration for Autonomous Driving
    Li, Ruyang
    Zhang, Yaqiang
    Zhao, Yaqian
    Wei, Hui
    Xu, Zhe
    Zhao, Kun
    PROCEEDINGS OF 2022 THE 6TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND SOFT COMPUTING, ICMLSC 2022, 2022, : 8 - 14
  • [6] Deep Reinforcement Learning with Risk-Seeking Exploration
    Dilokthanakul, Nat
    Shanahan, Murray
    FROM ANIMALS TO ANIMATS 15, 2018, 10994 : 201 - 211
  • [7] Improving exploration in deep reinforcement learning for stock trading
    Zemzem, Wiem
    Tagina, Moncef
    INTERNATIONAL JOURNAL OF COMPUTER APPLICATIONS IN TECHNOLOGY, 2023, 72 (04) : 288 - 295
  • [8] Exploring the Impact of Simple Explanations and Agency on Batch Deep Reinforcement Learning Induced Pedagogical Policies
    Ausin, Markel Sanz
    Maniktala, Mehak
    Barnes, Tiffany
    Chi, Min
    ARTIFICIAL INTELLIGENCE IN EDUCATION (AIED 2020), PT I, 2020, 12163 : 472 - 485
  • [9] Sample Efficient Reinforcement Learning via Model-Ensemble Exploration and Exploitation
    Yao, Yao
    Xiao, Li
    An, Zhicheng
    Zhang, Wanpeng
    Luo, Dijun
    2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 4202 - 4208
  • [10] Fast Robot Hierarchical Exploration Based on Deep Reinforcement Learning
    Zuo, Shun
    Niu, Jianwei
    Ren, Lu
    Ouyang, Zhenchao
    2023 INTERNATIONAL WIRELESS COMMUNICATIONS AND MOBILE COMPUTING, IWCMC, 2023, : 138 - 143