ATTENTION-BASED CURIOSITY-DRIVEN EXPLORATION IN DEEP REINFORCEMENT LEARNING

Cited by: 0
Authors
Reizinger, Patrik [1]
Szemenyei, Marton [1]
Affiliations
[1] Budapest Univ Technol & Econ, Dept Control Engn & Informat Technol, Budapest, Hungary
Source
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2020
Keywords
Reinforcement Learning; curiosity; exploration; attention
DOI
10.1109/icassp40776.2020.9054546
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Reinforcement Learning makes it possible to train an agent through interaction with the environment. However, in the majority of real-world scenarios, the extrinsic feedback is sparse or insufficient, so intrinsic reward formulations are needed to train the agent successfully. This work investigates and extends the paradigm of curiosity-driven exploration. Our aim is to develop means for better incorporating state- and/or action-dependent information into existing intrinsic reward formulations. First, a probabilistic approach is taken to exploit the advantages of the attention mechanism, which has been applied successfully in other domains of Deep Learning. Combining the two, we propose new methods, such as Attention-aided Advantage Actor-Critic, an extension of the Actor-Critic framework. Second, another curiosity-based approach, the Intrinsic Curiosity Module, is extended. The proposed model uses attention to emphasize features for the dynamic models within the Intrinsic Curiosity Module. We also modify the loss function, resulting in a new curiosity formulation, which we call rational curiosity (RCM).
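As a rough illustration of the idea in the abstract, the sketch below computes a curiosity-style intrinsic reward as the scaled squared error of a forward-model prediction, in the spirit of the Intrinsic Curiosity Module of Pathak et al. (2017), listed among the related papers. The softmax attention step, the function names, the `eta` scaling parameter, and the toy numbers are all assumptions made for illustration; they are not taken from the paper and do not reproduce its architecture or its modified loss.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(features, scores):
    """Reweight features with softmax attention weights.

    Hypothetical placement: in a real model the scores would come from
    a learned layer; here they are supplied by hand.
    """
    weights = softmax(scores)
    return [w * f for w, f in zip(weights, features)]


def intrinsic_reward(predicted_next, actual_next, eta=1.0):
    """Curiosity reward: (eta / 2) * squared forward-model prediction error."""
    sq_err = sum((p - a) ** 2 for p, a in zip(predicted_next, actual_next))
    return 0.5 * eta * sq_err


# Toy example (made-up values): features of the next state, uniform attention,
# and a forward model that predicts all zeros.
phi_next = [1.0, 0.0, 2.0]
attended = attend(phi_next, [0.0, 0.0, 0.0])
predicted = [0.0, 0.0, 0.0]
reward = intrinsic_reward(predicted, attended)
```

A larger prediction error yields a larger reward, which is what pushes the agent toward transitions its forward model cannot yet predict; the attention weights only change which feature dimensions contribute most to that error.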
Pages: 3542-3546
Number of pages: 5
Related papers
13 in total
[1] [Anonymous], 2018, ARXIV180310122
[2] [Anonymous], 2018, CORR
[3] Brockman G., 2016, ARXIV160601540
[4] Goodfellow IJ, 2014, ADV NEUR IN, V27, P2672
[5] Hill A., 2018, Stable baselines
[6] Mnih V, 2016, PR MACH LEARN RES, V48
[7] Paszke A., 2017, P NIPSW, P1
[8] Pathak, Deepak; Agrawal, Pulkit; Efros, Alexei A.; Darrell, Trevor. Curiosity-driven Exploration by Self-supervised Prediction [J]. 2017 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW), 2017: 488-489
[9] Pathak Deepak, 2019, ARXIV190604161
[10] Pfau David, 2017, CONNECTING GANS ACTO