Guided Reinforcement Learning via Sequence Learning

Cited by: 0

Authors:

Ramamurthy, Rajkumar [1]

Sifa, Rafet [1]

Luebbering, Max [1]

Bauckhage, Christian [1]

Affiliations:

[1] Fraunhofer IAIS, St Augustin, Germany

Source:

ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2020, PT II | 2020 / Vol. 12397

Keywords:

Reinforcement Learning; Exploration; Novelty Search; Representation learning; Sequence learning;

DOI:

10.1007/978-3-030-61616-8_27

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Applications of Reinforcement Learning (RL) suffer from high sample complexity due to sparse reward signals and inadequate exploration. Novelty Search (NS) serves as an auxiliary task in this regard, encouraging exploration towards unseen behaviors. However, NS suffers from critical drawbacks concerning scalability and generalizability, since it is based on instance learning. Addressing these challenges, we previously proposed a generic approach that uses unsupervised learning to learn representations of agent behaviors and uses reconstruction losses as novelty scores. However, that approach considered only fixed-length sequences and did not utilize the sequential information of behaviors. Therefore, we here extend it by using sequential auto-encoders to incorporate sequential dependencies. Experimental results on benchmark tasks show that this sequence learning aids exploration, outperforming previous novelty search methods.


Pages: 335-345

Page count: 11

