Inverse reinforcement learning from summary data

Cited by: 0
Authors
Antti Kangasrääsiö
Samuel Kaski
Affiliations
[1] Aalto University, Department of Computer Science
Source
Machine Learning | 2018, Vol. 107
Keywords
Inverse reinforcement learning; Bayesian inference; Monte-Carlo estimation; Approximate Bayesian computation;
Abstract
Inverse reinforcement learning (IRL) aims to explain observed strategic behavior by fitting reinforcement learning models to behavioral data. However, traditional IRL methods are only applicable when the observations are in the form of state-action paths. This assumption may not hold in many real-world modeling settings, where only partial or summarized observations are available. In general, we may assume that there is a summarizing function σ, which acts as a filter between us and the true state-action paths that constitute the demonstration. Some initial approaches to extending IRL to such situations have been presented, but with very specific assumptions about the structure of σ, such as that only certain state observations are missing. This paper instead focuses on the most general case of the problem, where no assumptions are made about the summarizing function, except that it can be evaluated. We demonstrate that inference is still possible. The paper presents exact and approximate inference algorithms that allow full posterior inference, which is particularly important for assessing parameter uncertainty in this challenging inference situation. Empirical scalability is demonstrated on reasonably sized problems, and practical applicability is demonstrated by estimating the posterior for a cognitive science RL model based on an observed user's task completion time only.
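The setting described above (only a summary σ(path) of the demonstration is observed, yet a full posterior over the behavioral parameters is wanted) maps naturally onto Approximate Bayesian Computation, one of the paper's keywords. The sketch below is a minimal, hypothetical rejection-ABC illustration of that idea, not the paper's own algorithm: the toy MDP, the softmax agent, the prior, and all names (`simulate_path`, `sigma`, `abc_posterior`, `theta`, `y_obs`) are assumptions introduced here for illustration. The summary here is simply the path length, mirroring the abstract's task-completion-time example.

```python
import numpy as np

# Illustrative rejection-ABC sketch for IRL from summary data.
# NOT the paper's algorithm: the toy environment, agent, and prior
# below are hypothetical assumptions chosen only to show the idea.

rng = np.random.default_rng(0)

def simulate_path(theta, max_steps=100):
    """Roll out a toy softmax agent; theta scales its preference for working."""
    path = []
    state = 0
    for _ in range(max_steps):
        # Two actions: 0 = idle, 1 = work towards the goal.
        prefs = np.array([0.0, theta])       # action preferences
        p = np.exp(prefs - prefs.max())
        p /= p.sum()
        action = rng.choice(2, p=p)
        path.append((state, action))
        state += action
        if state >= 10:                      # absorbing goal state
            break
    return path

def sigma(path):
    """Summarizing function: only the completion time is observed."""
    return len(path)

def abc_posterior(y_obs, n_samples=5000, eps=2.0):
    """Rejection ABC: keep prior draws whose simulated summary is close to y_obs."""
    accepted = []
    for _ in range(n_samples):
        theta = rng.uniform(0.0, 5.0)        # prior over the behavioral parameter
        y_sim = sigma(simulate_path(theta))
        if abs(y_sim - y_obs) <= eps:        # discrepancy threshold
            accepted.append(theta)
    return np.array(accepted)

post = abc_posterior(y_obs=15)
print(f"posterior mean {post.mean():.2f}, sd {post.std():.2f}, n={post.size}")
```

Rejection ABC is the simplest member of the ABC family; the paper's approximate algorithms are more refined, but the core loop has the same shape: draw parameters from the prior, simulate state-action paths, pass them through σ, and retain draws whose summaries match the observation.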
Pages: 1517–1535
Number of pages: 18
Related papers
50 records
  • [1] Inverse reinforcement learning from summary data
    Kangasrääsiö, Antti
    Kaski, Samuel
    MACHINE LEARNING, 2018, 107 (8-10) : 1517 - 1535
  • [2] Online inverse reinforcement learning with limited data
    Self, Ryan
    Mahmud, S. M. Nahid
    Hareland, Katrine
    Kamalapurkar, Rushikesh
    2020 59TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2020, : 603 - 608
  • [3] Unsupervised Inverse Reinforcement Learning with Noisy Data
    Surana, Amit
    2014 IEEE 53RD ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2014, : 4938 - 4945
  • [4] Inverse Reinforcement Learning from Failure
    Shiarlis, Kyriacos
    Messias, Joao
    Whiteson, Shimon
    AAMAS'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS, 2016, : 1060 - 1068
  • [5] Ordinal Inverse Reinforcement Learning Applied to Robot Learning with Small Data
    Colome, Adria
    Torras, Carme
    2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2022, : 2490 - 2496
  • [6] Learning Fairness from Demonstrations via Inverse Reinforcement Learning
    Blandin, Jack
    Kash, Ian
    PROCEEDINGS OF THE 2024 ACM CONFERENCE ON FAIRNESS, ACCOUNTABILITY, AND TRANSPARENCY, ACM FACCT 2024, 2024, : 51 - 61
  • [7] Inverse-Inverse Reinforcement Learning. How to Hide Strategy from an Adversarial Inverse Reinforcement Learner
    Pattanayak, Kunal
    Krishnamurthy, Vikram
    Berry, Christopher
    2022 IEEE 61ST CONFERENCE ON DECISION AND CONTROL (CDC), 2022, : 3631 - 3636
  • [8] Learning from Demonstration for Shaping through Inverse Reinforcement Learning
    Suay, Halit Bener
    Brys, Tim
    Taylor, Matthew E.
    Chernova, Sonia
    AAMAS'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS, 2016, : 429 - 437
  • [9] Dialogue Generation: From Imitation Learning to Inverse Reinforcement Learning
    Li, Ziming
    Kiseleva, Julia
    de Rijke, Maarten
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 6722 - 6729
  • [10] Expectation-Maximization for Inverse Reinforcement Learning with Hidden Data
    Bogert, Kenneth
    Lin, Jonathan Feng-Shun
    Doshi, Prashant
    Kulic, Dana
    AAMAS'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS, 2016, : 1034 - 1042