Inverse reinforcement learning from summary data

Cited by: 0
Authors
Antti Kangasrääsiö
Samuel Kaski
Affiliations
[1] Aalto University,Department of Computer Science
Source
Machine Learning | 2018 / Volume 107
Keywords
Inverse reinforcement learning; Bayesian inference; Monte-Carlo estimation; Approximate Bayesian computation;
DOI
Not available
Abstract
Inverse reinforcement learning (IRL) aims to explain observed strategic behavior by fitting reinforcement learning models to behavioral data. However, traditional IRL methods are only applicable when the observations are in the form of state-action paths. This assumption may not hold in many real-world modeling settings, where only partial or summarized observations are available. In general, we may assume that there is a summarizing function $\sigma$, which acts as a filter between us and the true state-action paths that constitute the demonstration. Some initial approaches to extending IRL to such situations have been presented, but with very specific assumptions about the structure of $\sigma$, such as that only certain state observations are missing. This paper instead focuses on the most general case of the problem, where no assumptions are made about the summarizing function, except that it can be evaluated. We demonstrate that inference is still possible. The paper presents exact and approximate inference algorithms that allow full posterior inference, which is particularly important for assessing parameter uncertainty in this challenging inference situation. Empirical scalability is demonstrated to reasonably sized problems, and practical applicability is demonstrated by estimating the posterior for a cognitive science RL model based on an observed user's task completion time only.
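The abstract's setting — inferring agent parameters when only a summary $\sigma$ of the trajectories is observed — can be illustrated with a minimal rejection-ABC sketch. Everything here is a toy stand-in, not the paper's model: the line-world task, the single policy parameter `theta`, the uniform prior, and the tolerance `eps` are all illustrative assumptions. The only structural requirement, matching the abstract, is that the summary function (here, task completion time) can be evaluated on simulated behavior.

```python
import random

random.seed(0)

GOAL = 10  # hypothetical task: reach position 10 on a line, starting from 0

def simulate_completion_time(theta, max_steps=500):
    """Roll out one episode under a policy parameterised by theta: at each
    step the agent moves toward the goal with probability theta, otherwise
    stays put. The returned completion time plays the role of sigma(path):
    the full state-action path is discarded, only the summary is kept."""
    pos, t = 0, 0
    while pos < GOAL and t < max_steps:
        if random.random() < theta:
            pos += 1
        t += 1
    return t

# "Observed" summary: one completion time produced by a ground-truth agent.
theta_true = 0.8
sigma_obs = simulate_completion_time(theta_true)

def abc_posterior(sigma_obs, n_draws=3000, eps=3.0):
    """Rejection ABC: draw theta from a uniform prior, simulate the summary,
    and keep draws whose simulated summary lands within eps of sigma_obs.
    The accepted draws approximate the posterior over theta."""
    accepted = []
    for _ in range(n_draws):
        theta = random.uniform(0.05, 1.0)
        if abs(simulate_completion_time(theta) - sigma_obs) <= eps:
            accepted.append(theta)
    return accepted

posterior = abc_posterior(sigma_obs)
```

Because the whole posterior sample is retained rather than a point estimate, one can inspect its spread — the parameter-uncertainty assessment the abstract emphasizes as important when only summary data are available.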
Pages: 1517–1535
Number of pages: 18
Related papers
50 records
  • [41] Neural inverse reinforcement learning in autonomous navigation
    Xia, Chen
    El Kamel, Abdelkader
    ROBOTICS AND AUTONOMOUS SYSTEMS, 2016, 84 : 1 - 14
  • [42] Off-Dynamics Inverse Reinforcement Learning
    Kang, Yachen
    Liu, Jinxin
    Wang, Donglin
    IEEE ACCESS, 2024, 12 : 65117 - 65127
  • [43] Inverse Reinforcement Learning in Partially Observable Environments
    Choi, Jaedeug
    Kim, Kee-Eung
    JOURNAL OF MACHINE LEARNING RESEARCH, 2011, 12 : 691 - 730
  • [44] Score-based Inverse Reinforcement Learning
    El Asri, Layla
    Piot, Bilal
    Geist, Matthieu
    Laroche, Romain
    Pietquin, Olivier
    AAMAS'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS, 2016, : 457 - 465
  • [45] Active Advice Seeking for Inverse Reinforcement Learning
    Odom, Phillip
    Natarajan, Sriraam
    AAMAS'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS, 2016, : 503 - 511
  • [46] Learning a Generic Olfactory Search Strategy From Silk Moths by Deep Inverse Reinforcement Learning
    Hernandez-Reyes, Cesar
    Shigaki, Shunsuke
    Yamada, Mayu
    Kondo, Takeshi
    Kurabayashi, Daisuke
    IEEE TRANSACTIONS ON MEDICAL ROBOTICS AND BIONICS, 2022, 4 (01): : 241 - 253
  • [47] Learning strategies in table tennis using inverse reinforcement learning
    Muelling, Katharina
    Boularias, Abdeslam
    Mohler, Betty
    Schoelkopf, Bernhard
    Peters, Jan
    BIOLOGICAL CYBERNETICS, 2014, 108 (05) : 603 - 619
  • [48] Inverse Reinforcement Learning Based Stochastic Driver Behavior Learning
    Ozkan, Mehmet F.
    Rocque, Abishek J.
    Ma, Yao
    IFAC PAPERSONLINE, 2021, 54 (20): : 882 - 888
  • [49] Learning Aircraft Pilot Skills by Adversarial Inverse Reinforcement Learning
    Suzuki, Kaito
    Uemura, Tsuneharu
    Tsuchiya, Takeshi
    Beppu, Hirofumi
    Hazui, Yusuke
    Ono, Hitoi
    2023 ASIA-PACIFIC INTERNATIONAL SYMPOSIUM ON AEROSPACE TECHNOLOGY, VOL I, APISAT 2023, 2024, 1050 : 1431 - 1441
  • [50] Methodologies for Imitation Learning via Inverse Reinforcement Learning: A Review
    Zhang K.
    Yu Y.
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2019, 56 (02): : 254 - 261