Visual Explanation using Attention Mechanism in Actor-Critic-based Deep Reinforcement Learning

Cited by: 15
Authors
Itaya, Hidenori [1]
Hirakawa, Tsubasa [1]
Yamashita, Takayoshi [1]
Fujiyoshi, Hironobu [1]
Sugiura, Komei [2]
Affiliations
[1] Chubu Univ, 1200 Matsumotocho, Kasugai, Aichi, Japan
[2] Keio Univ, Kohoku Ku, 3-14-1 Hiyoshi, Yokohama, Kanagawa, Japan
Source
2021 International Joint Conference on Neural Networks (IJCNN) | 2021
DOI
10.1109/IJCNN52387.2021.9534363
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep reinforcement learning (DRL) has great potential for acquiring optimal actions in complex environments such as games and robot control. However, it is difficult to analyze the agent's decision-making, i.e., the reasons it selects the actions it has learned. In this work, we propose Mask-Attention A3C (Mask A3C), which introduces an attention mechanism into Asynchronous Advantage Actor-Critic (A3C), an actor-critic-based DRL method, so that the agent's decision-making can be analyzed. A3C consists of a feature extractor that extracts features from an image, a policy branch that outputs the policy, and a value branch that outputs the state value. In this method, we focus on the policy and value branches and introduce an attention mechanism into each of them. The attention mechanism masks the feature maps of each branch using mask-attention, which expresses the reasons for the policy and state-value judgments as a heat map. We visualized mask-attention maps for Atari 2600 games and found that we could easily analyze the reasons behind an agent's decision-making in various game tasks. Furthermore, experimental results showed that introducing the attention mechanism enabled the agent to achieve higher performance.
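The branch structure described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how the mask is computed, so the 1x1 convolution followed by a sigmoid, the linear output heads, all layer sizes, and every function and parameter name below are assumptions chosen for illustration. A random tensor stands in for the output of the feature extractor.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x):
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()


def mask_attention(features, w_att, b_att):
    """Mask a (C, H, W) feature map with a single-channel attention map.

    Assumption: the mask comes from a 1x1 convolution (a weighted sum over
    channels) followed by a sigmoid; the abstract does not state this.
    """
    logits = np.tensordot(w_att, features, axes=([0], [0])) + b_att  # (H, W)
    mask = sigmoid(logits)                  # values in (0, 1), viewable as a heat map
    masked = features * mask[None, :, :]    # broadcast the mask over channels
    return masked, mask


def init_params(rng, channels, height, width, n_actions):
    """Hypothetical parameter set for the two branches."""
    n = channels * height * width
    return {
        "pol_att_w": rng.standard_normal(channels) * 0.1,
        "pol_att_b": 0.0,
        "val_att_w": rng.standard_normal(channels) * 0.1,
        "val_att_b": 0.0,
        "pol_w": rng.standard_normal((n_actions, n)) * 0.01,
        "pol_b": np.zeros(n_actions),
        "val_w": rng.standard_normal(n) * 0.01,
        "val_b": 0.0,
    }


def forward(features, params):
    """Apply a separate mask-attention in the policy and value branches."""
    pol_feat, pol_mask = mask_attention(
        features, params["pol_att_w"], params["pol_att_b"])
    val_feat, val_mask = mask_attention(
        features, params["val_att_w"], params["val_att_b"])
    policy = softmax(params["pol_w"] @ pol_feat.ravel() + params["pol_b"])
    value = float(params["val_w"] @ val_feat.ravel() + params["val_b"])
    return policy, value, pol_mask, val_mask


rng = np.random.default_rng(0)
features = rng.standard_normal((8, 7, 7))  # stand-in for extracted image features
params = init_params(rng, channels=8, height=7, width=7, n_actions=4)
policy, value, pol_mask, val_mask = forward(features, params)
```

In this sketch, `pol_mask` and `val_mask` play the role of the heat maps: upsampling either one to the input frame size and overlaying it on the frame would show which regions contributed to the policy or to the state value, respectively.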
Pages: 10