Estimating consistent reward of expert in multiple dynamics via linear programming inverse reinforcement learning

Cited by: 0
Authors
Nakata Y. [1 ]
Arai S. [2 ]
Institutions
[1] Department of Urban Environment Systems, Graduate School of Science and Engineering, Chiba University
[2] Department of Urban Environment Systems, Graduate School of Engineering, Chiba University
Keywords
Inverse reinforcement learning; Linear programming;
DOI
10.1527/tjsai.B-J23
Abstract
Reinforcement learning is a powerful framework for decision making and control, but it requires a manually specified reward function. Inverse reinforcement learning (IRL) automatically recovers a reward function from an expert's policy or demonstrations. Most existing IRL algorithms assume that the expert policy or demonstrations are given in a single fixed environment, but in practice they may be collected across multiple environments. In this work, we propose an IRL algorithm that is guaranteed to recover reward functions from the models of multiple environments and the expert policy for each environment. We assume that the experts in the multiple environments share a reward function, and we estimate reward functions under which each expert policy is optimal in its corresponding environment. To handle policies in multiple environments, we extend linear programming IRL: our method solves a linear program that maximizes the sum of the original objective functions of the environments while satisfying the optimality conditions of all the given environments. Satisfying the conditions of all the given environments is a necessary condition for matching the expert's reward, and the reward estimated by the proposed method satisfies this necessary condition. In experiments on Windy Gridworld environments, we demonstrate that our algorithm recovers reward functions under which the expert policies are optimal in their corresponding environments. © 2019, Japanese Society for Artificial Intelligence. All rights reserved.
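The multi-environment linear program sketched in the abstract can be illustrated in the style of Ng and Russell's LP IRL: each environment contributes the expert-optimality constraints of the original formulation, and the objective sums the per-environment margins. The following is a minimal sketch under assumptions, not the authors' implementation: the function name `lp_irl_multi`, the omission of the L1 regularizer on the reward, and the fixed reward bound `r_max` are all illustrative choices.

```python
import numpy as np
from scipy.optimize import linprog

def lp_irl_multi(envs, gamma=0.9, r_max=1.0):
    """LP IRL over several environments assumed to share one reward.

    envs: list of (P, pi) pairs, where P has shape (A, n, n) with P[a, s]
    the transition distribution of action a in state s, and pi[s] is the
    expert action in state s for that environment.
    Returns a reward vector under which every expert policy satisfies the
    LP optimality constraints of its own environment.
    """
    n = envs[0][0].shape[1]          # number of states
    n_t = n * len(envs)              # one slack t_s per state per environment
    # Variable layout: x = [R (n entries), t for env 1 .. env E (n each)]
    c = np.zeros(n + n_t)
    c[n:] = -1.0                     # maximize sum of slacks == minimize -sum
    A_ub, b_ub = [], []
    for e, (P, pi) in enumerate(envs):
        P_star = np.array([P[pi[s]][s] for s in range(n)])   # expert rows
        G = np.linalg.inv(np.eye(n) - gamma * P_star)        # (I - g*P_pi)^-1
        for s in range(n):
            for a in range(P.shape[0]):
                if a == pi[s]:
                    continue
                m = (P_star[s] - P[a][s]) @ G    # advantage row acting on R
                # m @ R >= t_s   rewritten as   t_s - m @ R <= 0
                row = np.zeros(n + n_t)
                row[:n] = -m
                row[n + e * n + s] = 1.0
                A_ub.append(row); b_ub.append(0.0)
                # m @ R >= 0: expert optimality constraint in environment e
                row = np.zeros(n + n_t)
                row[:n] = -m
                A_ub.append(row); b_ub.append(0.0)
    bounds = [(-r_max, r_max)] * n + [(None, None)] * n_t
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=bounds, method="highs")
    return res.x[:n]
```

For example, two three-state chains in which different actions move toward the same goal state both yield constraints over one shared reward vector, and the solver assigns the highest reward to the goal state.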
Related papers
50 items in total
  • [31] An Inverse Reinforcement Learning Method to Infer Reward Function of Intelligent Jammer
    Fan, Youlin
    Jiu, Bo
    Pu, Wenqiang
    Li, Kang
    Zhang, Yu
    Liu, Hongwei
    Proceedings of the IEEE Radar Conference, 2023,
  • [32] A Hierarchical Bayesian Approach to Inverse Reinforcement Learning with Symbolic Reward Machines
    Zhou, Weichao
    Li, Wenchao
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [33] Towards Interpretable Deep Reinforcement Learning Models via Inverse Reinforcement Learning
    Xie, Yuansheng
    Vosoughi, Soroush
    Hassanpour, Saeed
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 5067 - 5074
  • [34] On Reward-Free Reinforcement Learning with Linear Function Approximation
    Wang, Ruosong
    Du, Simon S.
    Yang, Lin F.
    Salakhutdinov, Ruslan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [35] Reward-Relevance-Filtered Linear Offline Reinforcement Learning
    Zhou, Angela
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [36] Inverse reinforcement learning methods for linear differential games
    Asl, Hamed Jabbari
    Uchibe, Eiji
    SYSTEMS & CONTROL LETTERS, 2024, 193
  • [37] Inverse Reinforcement Learning Control for Linear Multiplayer Games
    Lian, Bosen
    Donge, Vrushabh S.
    Lewis, Frank L.
    Chai, Tianyou
    Davoudi, Ali
    2022 IEEE 61ST CONFERENCE ON DECISION AND CONTROL (CDC), 2022, : 2839 - 2844
  • [38] Linear inverse reinforcement learning in continuous time and space
    Kamalapurkar, Rushikesh
    2018 ANNUAL AMERICAN CONTROL CONFERENCE (ACC), 2018, : 1683 - 1688
  • [39] Learning Fairness from Demonstrations via Inverse Reinforcement Learning
    Blandin, Jack
    Kash, Ian
    PROCEEDINGS OF THE 2024 ACM CONFERENCE ON FAIRNESS, ACCOUNTABILITY, AND TRANSPARENCY, ACM FACCT 2024, 2024, : 51 - 61
  • [40] Learning Tasks in Intelligent Environments via Inverse Reinforcement Learning
    Shah, Syed Ihtesham Hussain
    Coronato, Antonio
    2021 17TH INTERNATIONAL CONFERENCE ON INTELLIGENT ENVIRONMENTS (IE), 2021,