Active Learning for Reward Estimation in Inverse Reinforcement Learning

Cited: 0
|
Authors
Lopes, Manuel [1 ]
Melo, Francisco [2 ]
Montesano, Luis [3 ]
Affiliations
[1] Univ Tecn Lisboa, Inst Sistemas & Robotica, Inst Super Tecn, Lisbon, Portugal
[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[3] Univ Zaragoza, Zaragoza, Spain
Source
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PT II | 2009 / Vol. 5782
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Inverse reinforcement learning addresses the general problem of recovering a reward function from samples of a policy provided by an expert/demonstrator. In this paper, we introduce active learning for inverse reinforcement learning. We propose an algorithm that allows the agent to query the demonstrator for samples at specific states, instead of relying only on samples provided at "arbitrary" states. The purpose of our algorithm is to estimate the reward function with similar accuracy as other methods from the literature while reducing the amount of policy samples required from the expert. We also discuss the use of our algorithm in higher dimensional problems, using both Monte Carlo and gradient methods. We present illustrative results of our algorithm in several simulated examples of different complexities.
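The abstract only sketches the querying idea, so the snippet below is a minimal, hypothetical illustration of active state selection for inverse reinforcement learning, not the authors' actual algorithm or selection criterion. It assumes posterior samples over reward functions are already available (e.g., from a Bayesian IRL sampler) and picks the next query state as the one where the optimal policies induced by those samples disagree most; the entropy-based heuristic, function names, and toy MDP are all illustrative assumptions.

```python
import numpy as np

def value_iteration(P, r, gamma=0.95, iters=200):
    """Greedy optimal policy for reward vector r.
    P has shape (A, S, S): P[a, s, s'] = transition probability."""
    A, S, _ = P.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = r[None, :] + gamma * (P @ V)   # shape (A, S)
        V = Q.max(axis=0)
    return Q.argmax(axis=0)                # action per state

def pick_query_state(P, reward_samples, queried):
    """Illustrative active-learning criterion (not the paper's exact one):
    query the state whose induced optimal action is most uncertain across
    posterior reward samples, measured by action-distribution entropy."""
    A, S, _ = P.shape
    counts = np.zeros((S, A))
    for r in reward_samples:
        pi = value_iteration(P, r)
        counts[np.arange(S), pi] += 1
    probs = counts / counts.sum(axis=1, keepdims=True)
    ent = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    ent[list(queried)] = -np.inf           # do not re-query known states
    return int(ent.argmax())

# Toy usage on a random 8-state, 3-action MDP.
rng = np.random.default_rng(0)
S, A = 8, 3
P = rng.dirichlet(np.ones(S), size=(A, S))   # random transition kernel
true_r = rng.normal(size=S)                  # hidden "expert" reward
expert_policy = value_iteration(P, true_r)

# Stand-in for posterior samples over rewards (e.g., MCMC in Bayesian IRL).
reward_samples = [true_r + rng.normal(scale=0.5, size=S) for _ in range(30)]

queried = set()
for step in range(3):
    s = pick_query_state(P, reward_samples, queried)
    a = expert_policy[s]                     # "ask the demonstrator" at state s
    queried.add(s)
    print(f"query {step}: state {s}, demonstrated action {a}")
```

In this sketch the demonstrated action at each queried state would then be fed back into the reward posterior; the paper discusses both Monte Carlo and gradient-based variants for that estimation step.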
Pages: 31 / +
Number of pages: 2
Related Papers
50 records in total
  • [1] Reward Identification in Inverse Reinforcement Learning
    Kim, Kuno
    Garg, Shivam
    Shiragur, Kirankumar
    Ermon, Stefano
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [2] Compatible Reward Inverse Reinforcement Learning
    Metelli, Alberto Maria
    Pirotta, Matteo
    Restelli, Marcello
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [3] Reinforcement Learning for Data Preparation with Active Reward Learning
    Berti-Equille, Laure
    INTERNET SCIENCE, INSCI 2019, 2019, 11938 : 121 - 132
  • [4] Option compatible reward inverse reinforcement learning
    Hwang, Rakhoon
    Lee, Hanjin
    Hwang, Hyung Ju
    PATTERN RECOGNITION LETTERS, 2022, 154 : 83 - 89
  • [5] Inverse Reinforcement Learning with the Average Reward Criterion
    Wu, Feiyang
    Ke, Jingyang
    Wu, Anqi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [6] Improved Reward Estimation for Efficient Robot Navigation Using Inverse Reinforcement Learning
    Saha, Olimpiya
    Dasgupta, Prithviraj
    2017 NASA/ESA CONFERENCE ON ADAPTIVE HARDWARE AND SYSTEMS (AHS), 2017, : 245 - 252
  • [7] Active Exploration for Inverse Reinforcement Learning
    Lindner, David
    Krause, Andreas
    Ramponi, Giorgia
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [8] Inverse Reinforcement Learning with Locally Consistent Reward Functions
    Quoc Phong Nguyen
    Low, Kian Hsiang
    Jaillet, Patrick
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28
  • [9] Bayesian Inverse Reinforcement Learning-based Reward Learning for Automated Driving
    Zeng, Di
    Zheng, Ling
    Li, Yinong
    Yang, Xiantong
    Jixie Gongcheng Xuebao/Journal of Mechanical Engineering, 2024, 60 (10): 245 - 260
  • [10] Bayesian inverse reinforcement learning for demonstrations of an expert in multiple dynamics: Toward estimation of transferable reward
    Yusuke N.
    Sachiyo A.
    Transactions of the Japanese Society for Artificial Intelligence, 2020, 35 (01)