Guided Reinforcement Learning via Sequence Learning

Cited by: 0

Authors:

Ramamurthy, Rajkumar [1]

Sifa, Rafet [1]

Luebbering, Max [1]

Bauckhage, Christian [1]

Affiliation:

[1] Fraunhofer IAIS, St Augustin, Germany

Source:

ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2020, PT II | 2020, Vol. 12397

Keywords:

Reinforcement Learning; Exploration; Novelty Search; Representation Learning; Sequence Learning

DOI:

10.1007/978-3-030-61616-8_27

CLC classification number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Applications of Reinforcement Learning (RL) suffer from high sample complexity due to sparse reward signals and inadequate exploration. Novelty Search (NS) serves as an auxiliary task in this regard, encouraging exploration towards unseen behaviors. However, NS suffers from critical drawbacks concerning scalability and generalizability since it is based on instance learning. Addressing these challenges, we previously proposed a generic approach using unsupervised learning to learn representations of agent behaviors and use reconstruction losses as novelty scores. However, it considered only fixed-length sequences and did not utilize sequential information of behaviors. Therefore, we here extend this approach by using sequential auto-encoders to incorporate sequential dependencies. Experimental results on benchmark tasks show that this sequence learning aids exploration, outperforming previous novelty search methods.
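The idea of using a reconstruction loss as a novelty score can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a linear (PCA-style) auto-encoder over fixed-length behavior vectors in place of the sequential auto-encoder the abstract describes, and the function names, dimensions, and synthetic data are all hypothetical. It shows only the scoring principle: behaviors that resemble those already seen reconstruct well and score low, while unfamiliar behaviors reconstruct poorly and score high.

```python
import numpy as np

def fit_linear_autoencoder(behaviors, latent_dim):
    """Fit a PCA-style linear auto-encoder; returns (mean, components)."""
    mean = behaviors.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(behaviors - mean, full_matrices=False)
    return mean, vt[:latent_dim]

def novelty_score(behavior, mean, components):
    """Mean squared reconstruction error of a single behavior vector."""
    centered = behavior - mean
    # Encode into the latent space, then decode back.
    reconstruction = (centered @ components.T) @ components
    return float(np.mean((centered - reconstruction) ** 2))

rng = np.random.default_rng(0)

# Hypothetical archive: 200 behaviors lying near a 2-D subspace of a 10-D space.
basis = rng.normal(size=(2, 10))
seen = rng.normal(size=(200, 2)) @ basis + 0.01 * rng.normal(size=(200, 10))
mean, components = fit_linear_autoencoder(seen, latent_dim=2)

familiar = rng.normal(size=2) @ basis  # drawn from the same subspace
unseen = 3.0 * rng.normal(size=10)     # arbitrary direction in the full space

familiar_score = novelty_score(familiar, mean, components)
unseen_score = novelty_score(unseen, mean, components)
print(familiar_score, unseen_score)
```

In an exploration setting, a score of this kind would be added to or substituted for the sparse task reward; how the paper combines the two is not stated in the abstract.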


Pages: 335-345

Page count: 11
