Incorporating Explanations to Balance the Exploration and Exploitation of Deep Reinforcement Learning

Cited by: 1

Authors:

Wang, Xinzhi [1]

Liu, Yang [1]

Chang, Yudong [1]

Jiang, Chao [1]

Zhang, Qingjie [1]

Affiliations:

[1] Shanghai Univ, Sch Comp Engn & Sci, Shanghai, Peoples R China

Source:

KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT II | 2022 / vol. 13369

Keywords:

Deep reinforcement learning; Explainable AI; Explanation fusion; Variational auto encoder; LEVEL;

DOI:

10.1007/978-3-031-10986-7_16

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Discovering efficient exploration strategies is a central challenge in reinforcement learning (RL). Deep reinforcement learning (DRL) methods proposed in recent years have mainly focused on improving the generalization of models while ignoring their explainability. In this study, an embedding explanation for the advantage actor-critic algorithm (EEA2C) is proposed to balance exploration and exploitation in DRL models. Specifically, the proposed algorithm first explains the agent's actions and then uses those explanations to guide exploration. A fusion strategy is designed to retain the information from experience that is helpful for exploration. Based on the results of the fusion strategy, a variational autoencoder (VAE) encodes the task-related explanation into a probabilistic latent representation. Finally, the latent representation of the VAE is incorporated into the agent's policy as prior knowledge. Experimental results on six Atari environments show that the proposed method improves the agent's exploratory capabilities with explainable knowledge.
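The abstract's last two steps (encoding an explanation into a probabilistic latent representation and feeding that latent into the policy) can be illustrated with a minimal NumPy sketch. This is not the EEA2C implementation: the linear encoder, the layer sizes, the random weights, and every function name below are hypothetical choices made only for illustration, and the explanation-generation and fusion steps are replaced by a random vector.

```python
import numpy as np

def encode(explanation, w_mu, w_logvar):
    """Hypothetical linear encoder: mean and log-variance of q(z | explanation)."""
    return explanation @ w_mu, explanation @ w_logvar

def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I) (reparameterization trick)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, I)), summed over latent dimensions."""
    return -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar), axis=-1)

def policy_probs(state_features, z, w_policy):
    """Softmax policy over actions, conditioned on state features joined with z."""
    logits = np.concatenate([state_features, z], axis=-1) @ w_policy
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
explanation = rng.random(64)       # stand-in for a fused, flattened explanation
state_features = rng.random(32)    # stand-in for the actor's state embedding
w_mu = rng.normal(scale=0.1, size=(64, 8))       # assumed latent size: 8
w_logvar = rng.normal(scale=0.1, size=(64, 8))
w_policy = rng.normal(scale=0.1, size=(40, 6))   # 32 + 8 inputs, 6 actions

mu, logvar = encode(explanation, w_mu, w_logvar)
z = reparameterize(mu, logvar, rng)
probs = policy_probs(state_features, z, w_policy)
kl = kl_to_standard_normal(mu, logvar)
```

In a trained model the KL term would be added to the VAE loss alongside a reconstruction term; here it is computed only to show the shape of the quantity.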


Pages: 200-211

Page count: 12

