Inverse reinforcement learning from summary data

Cited by: 0
Authors
Antti Kangasrääsiö
Samuel Kaski
Affiliations
[1] Aalto University, Department of Computer Science
Source
Machine Learning | 2018, Vol. 107
Keywords
Inverse reinforcement learning; Bayesian inference; Monte-Carlo estimation; Approximate Bayesian computation;
Abstract
Inverse reinforcement learning (IRL) aims to explain observed strategic behavior by fitting reinforcement learning models to behavioral data. However, traditional IRL methods are only applicable when the observations are in the form of state-action paths. This assumption may not hold in many real-world modeling settings, where only partial or summarized observations are available. In general, we may assume that there is a summarizing function σ, which acts as a filter between us and the true state-action paths that constitute the demonstration. Some initial approaches to extending IRL to such situations have been presented, but with very specific assumptions about the structure of σ, such as that only certain state observations are missing. This paper instead focuses on the most general case of the problem, where no assumptions are made about the summarizing function, except that it can be evaluated. We demonstrate that inference is still possible. The paper presents exact and approximate inference algorithms that allow full posterior inference, which is particularly important for assessing parameter uncertainty in this challenging inference situation. Empirical scalability is demonstrated on reasonably sized problems, and practical applicability is demonstrated by estimating the posterior for a cognitive science RL model based on an observed user's task completion time only.
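The setting described above (only a summary σ(path) of the demonstration is observed, yet a full posterior over the behavioral parameters is wanted) maps naturally onto Approximate Bayesian Computation, one of the paper's keywords. The sketch below is a minimal, hypothetical rejection-ABC illustration of that idea, not the paper's own algorithm: the toy MDP, the softmax agent, the prior, and all names (`simulate_path`, `sigma`, `abc_posterior`, `theta`, `y_obs`) are assumptions introduced here for illustration. The summary here is simply the path length, mirroring the abstract's task-completion-time example.

```python
import numpy as np

# Illustrative rejection-ABC sketch for IRL from summary data.
# NOT the paper's algorithm: the toy environment, agent, and prior
# below are hypothetical assumptions chosen only to show the idea.

rng = np.random.default_rng(0)

def simulate_path(theta, max_steps=100):
    """Roll out a toy softmax agent; theta scales its preference for working."""
    path = []
    state = 0
    for _ in range(max_steps):
        # Two actions: 0 = idle, 1 = work towards the goal.
        prefs = np.array([0.0, theta])       # action preferences
        p = np.exp(prefs - prefs.max())
        p /= p.sum()
        action = rng.choice(2, p=p)
        path.append((state, action))
        state += action
        if state >= 10:                      # absorbing goal state
            break
    return path

def sigma(path):
    """Summarizing function: only the completion time is observed."""
    return len(path)

def abc_posterior(y_obs, n_samples=5000, eps=2.0):
    """Rejection ABC: keep prior draws whose simulated summary is close to y_obs."""
    accepted = []
    for _ in range(n_samples):
        theta = rng.uniform(0.0, 5.0)        # prior over the behavioral parameter
        y_sim = sigma(simulate_path(theta))
        if abs(y_sim - y_obs) <= eps:        # discrepancy threshold
            accepted.append(theta)
    return np.array(accepted)

post = abc_posterior(y_obs=15)
print(f"posterior mean {post.mean():.2f}, sd {post.std():.2f}, n={post.size}")
```

Rejection ABC is the simplest member of the ABC family; the paper's approximate algorithms are more refined, but the core loop has the same shape: draw parameters from the prior, simulate state-action paths, pass them through σ, and retain draws whose summaries match the observation.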
Pages: 1517–1535
Number of pages: 18
Related papers
50 records
  • [1] Inverse reinforcement learning from summary data
    Kangasrääsiö, Antti
    Kaski, Samuel
    MACHINE LEARNING, 2018, 107 (8-10) : 1517 - 1535
  • [2] Online inverse reinforcement learning with limited data
    Self, Ryan
    Mahmud, S. M. Nahid
    Hareland, Katrine
    Kamalapurkar, Rushikesh
    2020 59TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2020, : 603 - 608
  • [3] Unsupervised Inverse Reinforcement Learning with Noisy Data
    Surana, Amit
    2014 IEEE 53RD ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2014, : 4938 - 4945
  • [4] Inverse Reinforcement Learning from Failure
    Shiarlis, Kyriacos
    Messias, Joao
    Whiteson, Shimon
    AAMAS'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS, 2016, : 1060 - 1068
  • [5] Ordinal Inverse Reinforcement Learning Applied to Robot Learning with Small Data
    Colome, Adria
    Torras, Carme
    2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2022, : 2490 - 2496
  • [6] Learning Fairness from Demonstrations via Inverse Reinforcement Learning
    Blandin, Jack
    Kash, Ian
    PROCEEDINGS OF THE 2024 ACM CONFERENCE ON FAIRNESS, ACCOUNTABILITY, AND TRANSPARENCY, ACM FACCT 2024, 2024, : 51 - 61
  • [7] Inverse-Inverse Reinforcement Learning. How to Hide Strategy from an Adversarial Inverse Reinforcement Learner
    Pattanayak, Kunal
    Krishnamurthy, Vikram
    Berry, Christopher
    2022 IEEE 61ST CONFERENCE ON DECISION AND CONTROL (CDC), 2022, : 3631 - 3636
  • [8] Learning from Demonstration for Shaping through Inverse Reinforcement Learning
    Suay, Halit Bener
    Brys, Tim
    Taylor, Matthew E.
    Chernova, Sonia
    AAMAS'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS, 2016, : 429 - 437
  • [9] Dialogue Generation: From Imitation Learning to Inverse Reinforcement Learning
    Li, Ziming
    Kiseleva, Julia
    de Rijke, Maarten
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 6722 - 6729
  • [10] Expectation-Maximization for Inverse Reinforcement Learning with Hidden Data
    Bogert, Kenneth
    Lin, Jonathan Feng-Shun
    Doshi, Prashant
    Kulic, Dana
    AAMAS'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS, 2016, : 1034 - 1042