Estimating consistent reward of expert in multiple dynamics via linear programming inverse reinforcement learning

Cited by: 0
Authors
Nakata Y. [1 ]
Arai S. [2 ]
Affiliations
[1] Department of Urban Environment Systems, Graduate School of Science and Engineering, Chiba University
[2] Department of Urban Environment Systems, Graduate School of Engineering, Chiba University
Keywords
Inverse reinforcement learning; Linear programming
DOI
10.1527/tjsai.B-J23
Abstract
Reinforcement learning is a powerful framework for decision making and control, but it requires a manually specified reward function. Inverse reinforcement learning (IRL) automatically recovers a reward function from an expert's policy or demonstrations. Most existing IRL algorithms assume that the expert policy or demonstrations are given for a single, fixed environment, but in practice they may be collected across multiple environments. In this work, we propose an IRL algorithm that recovers reward functions from the models of multiple environments together with the expert policy for each environment. We assume that the expert shares a single reward function across all environments, and we estimate reward functions for which each expert policy is optimal in its corresponding environment. To handle policies in multiple environments, we extend linear programming IRL: our method solves a linear program that maximizes the sum of the original per-environment objective functions while satisfying the optimality conditions of all given environments. Satisfying the conditions of all given environments is a necessary condition for matching the expert's reward, and the reward estimated by the proposed method satisfies this condition. In experiments on Windy Gridworld environments, we demonstrate that our algorithm recovers reward functions for which the expert policies are optimal in their corresponding environments. © 2019, Japanese Society for Artificial Intelligence. All rights reserved.
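A minimal sketch of the idea described in the abstract, not the authors' implementation: it applies the Ng and Russell (2000) LP-IRL optimality conditions to several environments with one shared reward vector, maximizing the sum of the per-environment objectives subject to the constraints of every environment. The function name multi_env_lp_irl, the data layout P[m][a] (an n x n transition matrix for action a in environment m), expert_pi[m] (the expert's action per state in environment m), and the use of cvxpy are illustrative assumptions.

import numpy as np
import cvxpy as cp

def multi_env_lp_irl(P, expert_pi, gamma=0.9, r_max=1.0, l1=1.0):
    """LP-IRL with one reward shared across several environments (sketch)."""
    n_states = P[0][0].shape[0]
    R = cp.Variable(n_states)              # reward shared by all environments
    objective_terms, constraints = [], []

    for P_m, pi_m in zip(P, expert_pi):
        # Transition rows chosen by the expert policy in this environment.
        P_star = np.array([P_m[pi_m[s]][s] for s in range(n_states)])
        # Discounted occupancy operator (I - gamma * P_a*)^{-1}.
        inv = np.linalg.inv(np.eye(n_states) - gamma * P_star)

        for s in range(n_states):
            gaps = []
            for a in range(len(P_m)):
                if a == pi_m[s]:
                    continue
                # Expert-optimality condition:
                # (P_{a*}(s) - P_a(s)) (I - gamma * P_{a*})^{-1} R >= 0
                w = (P_star[s] - P_m[a][s]) @ inv
                gap = w @ R
                constraints.append(gap >= 0)
                gaps.append(gap)
            # Per-state margin of the expert action over its best alternative.
            objective_terms.append(gaps[0] if len(gaps) == 1
                                   else cp.minimum(*gaps))

    constraints.append(cp.abs(R) <= r_max)
    problem = cp.Problem(
        cp.Maximize(sum(objective_terms) - l1 * cp.norm1(R)),
        constraints)
    problem.solve()
    return R.value

With a single environment this reduces to the standard LP-IRL program; each additional environment only appends its constraints and objective terms, so the estimated reward remains feasible for every given environment, which is the necessary condition the paper emphasizes.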
Related papers (50 in total)
  • [41] Methodologies for Imitation Learning via Inverse Reinforcement Learning: A Review
    Zhang K.
    Yu Y.
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2019, 56 (02): : 254 - 261
  • [42] Graphical Model Parameter Learning by Inverse Linear Programming
    Trajkovska, Vera
    Swoboda, Paul
    Astroem, Freddie
    Petra, Stefania
    SCALE SPACE AND VARIATIONAL METHODS IN COMPUTER VISION, SSVM 2017, 2017, 10302 : 323 - 334
  • [43] Ensemble inverse reinforcement learning from semi-expert agents
    Tomita, Shinji
    Hamatsu, Fumiya
    Hamagami, Tomoki
    IEEJ Transactions on Electronics, Information and Systems, 2017, 137 (04): : 667 - 673
  • [44] Adaptively Shaping Reinforcement Learning Agents via Human Reward
    Yu, Chao
    Wang, Dongxu
    Yang, Tianpei
    Zhu, Wenxuan
    Li, Yuchen
    Ge, Hongwei
    Ren, Jiankang
    PRICAI 2018: TRENDS IN ARTIFICIAL INTELLIGENCE, PT I, 2018, 11012 : 85 - 97
  • [45] Generalized Maximum Entropy Reinforcement Learning via Reward Shaping
    Tao F.
    Wu M.
    Cao Y.
    IEEE Transactions on Artificial Intelligence, 2024, 5 (04): : 1563 - 1572
  • [46] Point Cloud Registration via Heuristic Reward Reinforcement Learning
    Chen, Bingren
    STATS, 2023, 6 (01): : 268 - 278
  • [47] Guaranteeing Control Requirements via Reward Shaping in Reinforcement Learning
    De Lellis, Francesco
    Coraggio, Marco
    Russo, Giovanni
    Musolesi, Mirco
    di Bernardo, Mario
    IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY, 2024, 32 (06) : 2102 - 2113
  • [48] Inverse Reinforcement Learning of Interaction Dynamics from Demonstrations
    Hussein, Mostafa
    Begum, Momotaz
    Petrik, Marek
    2019 INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2019, : 2267 - 2274
  • [49] Inverse Reinforcement Learning with Simultaneous Estimation of Rewards and Dynamics
    Herman, Michael
    Gindele, Tobias
    Wagner, Joerg
    Schmitt, Felix
    Burgard, Wolfram
    ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 51, 2016, 51 : 102 - 110
  • [50] Reward Function Design for Crowd Simulation via Reinforcement Learning
    Kwiatkowski, Ariel
    Kalogeiton, Vicky
    Pettre, Julien
    Cani, Marie-Paule
    15TH ANNUAL ACM SIGGRAPH CONFERENCE ON MOTION, INTERACTION AND GAMES, MIG 2023, 2023,