Active Learning for Reward Estimation in Inverse Reinforcement Learning

Cited: 0
|
Authors
Lopes, Manuel [1 ]
Melo, Francisco [2 ]
Montesano, Luis [3 ]
Affiliations
[1] Univ Tecn Lisboa, Inst Sistemas & Robotica, Inst Super Tecn, Lisbon, Portugal
[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[3] Univ Zaragoza, Zaragoza, Spain
Source
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PT II | 2009 / Vol. 5782
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Inverse reinforcement learning addresses the general problem of recovering a reward function from samples of a policy provided by an expert/demonstrator. In this paper, we introduce active learning for inverse reinforcement learning. We propose an algorithm that allows the agent to query the demonstrator for samples at specific states, instead of relying only on samples provided at "arbitrary" states. The purpose of our algorithm is to estimate the reward function with similar accuracy as other methods from the literature while reducing the amount of policy samples required from the expert. We also discuss the use of our algorithm in higher dimensional problems, using both Monte Carlo and gradient methods. We present illustrative results of our algorithm in several simulated examples of different complexities.
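The abstract only sketches the querying idea, so the snippet below is a minimal, hypothetical illustration of active state selection for inverse reinforcement learning, not the authors' actual algorithm or selection criterion. It assumes posterior samples over reward functions are already available (e.g., from a Bayesian IRL sampler) and picks the next query state as the one where the optimal policies induced by those samples disagree most; the entropy-based heuristic, function names, and toy MDP are all illustrative assumptions.

```python
import numpy as np

def value_iteration(P, r, gamma=0.95, iters=200):
    """Greedy optimal policy for reward vector r.
    P has shape (A, S, S): P[a, s, s'] = transition probability."""
    A, S, _ = P.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = r[None, :] + gamma * (P @ V)   # shape (A, S)
        V = Q.max(axis=0)
    return Q.argmax(axis=0)                # action per state

def pick_query_state(P, reward_samples, queried):
    """Illustrative active-learning criterion (not the paper's exact one):
    query the state whose induced optimal action is most uncertain across
    posterior reward samples, measured by action-distribution entropy."""
    A, S, _ = P.shape
    counts = np.zeros((S, A))
    for r in reward_samples:
        pi = value_iteration(P, r)
        counts[np.arange(S), pi] += 1
    probs = counts / counts.sum(axis=1, keepdims=True)
    ent = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    ent[list(queried)] = -np.inf           # do not re-query known states
    return int(ent.argmax())

# Toy usage on a random 8-state, 3-action MDP.
rng = np.random.default_rng(0)
S, A = 8, 3
P = rng.dirichlet(np.ones(S), size=(A, S))   # random transition kernel
true_r = rng.normal(size=S)                  # hidden "expert" reward
expert_policy = value_iteration(P, true_r)

# Stand-in for posterior samples over rewards (e.g., MCMC in Bayesian IRL).
reward_samples = [true_r + rng.normal(scale=0.5, size=S) for _ in range(30)]

queried = set()
for step in range(3):
    s = pick_query_state(P, reward_samples, queried)
    a = expert_policy[s]                     # "ask the demonstrator" at state s
    queried.add(s)
    print(f"query {step}: state {s}, demonstrated action {a}")
```

In this sketch the demonstrated action at each queried state would then be fed back into the reward posterior; the paper discusses both Monte Carlo and gradient-based variants for that estimation step.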
Pages: 31 / +
Number of pages: 2
Related Papers
50 records in total
  • [1] Reward Identification in Inverse Reinforcement Learning
    Kim, Kuno
    Garg, Shivam
    Shiragur, Kirankumar
    Ermon, Stefano
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [2] Compatible Reward Inverse Reinforcement Learning
    Metelli, Alberto Maria
    Pirotta, Matteo
    Restelli, Marcello
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [3] Reinforcement Learning for Data Preparation with Active Reward Learning
    Berti-Equille, Laure
    INTERNET SCIENCE, INSCI 2019, 2019, 11938 : 121 - 132
  • [4] Option compatible reward inverse reinforcement learning
    Hwang, Rakhoon
    Lee, Hanjin
    Hwang, Hyung Ju
    PATTERN RECOGNITION LETTERS, 2022, 154 : 83 - 89
  • [5] Inverse Reinforcement Learning with the Average Reward Criterion
    Wu, Feiyang
    Ke, Jingyang
    Wu, Anqi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [6] Improved Reward Estimation for Efficient Robot Navigation Using Inverse Reinforcement Learning
    Saha, Olimpiya
    Dasgupta, Prithviraj
    2017 NASA/ESA CONFERENCE ON ADAPTIVE HARDWARE AND SYSTEMS (AHS), 2017, : 245 - 252
  • [7] Active Exploration for Inverse Reinforcement Learning
    Lindner, David
    Krause, Andreas
    Ramponi, Giorgia
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [8] Inverse Reinforcement Learning with Locally Consistent Reward Functions
    Quoc Phong Nguyen
    Low, Kian Hsiang
    Jaillet, Patrick
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28
  • [9] Bayesian Inverse Reinforcement Learning-based Reward Learning for Automated Driving
    Zeng, Di
    Zheng, Ling
    Li, Yinong
    Yang, Xiantong
    Jixie Gongcheng Xuebao/Journal of Mechanical Engineering, 2024, 60 (10): 245 - 260
  • [10] Bayesian inverse reinforcement learning for demonstrations of an expert in multiple dynamics: Toward estimation of transferable reward
    Yusuke N.
    Sachiyo A.
    Transactions of the Japanese Society for Artificial Intelligence, 2020, 35 (01)