Inverse reinforcement learning from summary data

Cited by: 0
Authors
Antti Kangasrääsiö
Samuel Kaski
Affiliations
[1] Aalto University,Department of Computer Science
Source
Machine Learning | 2018 / Volume 107
Keywords
Inverse reinforcement learning; Bayesian inference; Monte-Carlo estimation; Approximate Bayesian computation;
DOI
Not available
Abstract
Inverse reinforcement learning (IRL) aims to explain observed strategic behavior by fitting reinforcement learning models to behavioral data. However, traditional IRL methods are only applicable when the observations are in the form of state-action paths. This assumption may not hold in many real-world modeling settings, where only partial or summarized observations are available. In general, we may assume that there is a summarizing function $\sigma$, which acts as a filter between us and the true state-action paths that constitute the demonstration. Some initial approaches to extending IRL to such situations have been presented, but with very specific assumptions about the structure of $\sigma$, such as that only certain state observations are missing. This paper instead focuses on the most general case of the problem, where no assumptions are made about the summarizing function, except that it can be evaluated. We demonstrate that inference is still possible. The paper presents exact and approximate inference algorithms that allow full posterior inference, which is particularly important for assessing parameter uncertainty in this challenging inference situation. Empirical scalability is demonstrated to reasonably sized problems, and practical applicability is demonstrated by estimating the posterior for a cognitive science RL model based on an observed user's task completion time only.
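The abstract's setting — inferring agent parameters when only a summary $\sigma$ of the trajectories is observed — can be illustrated with a minimal rejection-ABC sketch. Everything here is a toy stand-in, not the paper's model: the line-world task, the single policy parameter `theta`, the uniform prior, and the tolerance `eps` are all illustrative assumptions. The only structural requirement, matching the abstract, is that the summary function (here, task completion time) can be evaluated on simulated behavior.

```python
import random

random.seed(0)

GOAL = 10  # hypothetical task: reach position 10 on a line, starting from 0

def simulate_completion_time(theta, max_steps=500):
    """Roll out one episode under a policy parameterised by theta: at each
    step the agent moves toward the goal with probability theta, otherwise
    stays put. The returned completion time plays the role of sigma(path):
    the full state-action path is discarded, only the summary is kept."""
    pos, t = 0, 0
    while pos < GOAL and t < max_steps:
        if random.random() < theta:
            pos += 1
        t += 1
    return t

# "Observed" summary: one completion time produced by a ground-truth agent.
theta_true = 0.8
sigma_obs = simulate_completion_time(theta_true)

def abc_posterior(sigma_obs, n_draws=3000, eps=3.0):
    """Rejection ABC: draw theta from a uniform prior, simulate the summary,
    and keep draws whose simulated summary lands within eps of sigma_obs.
    The accepted draws approximate the posterior over theta."""
    accepted = []
    for _ in range(n_draws):
        theta = random.uniform(0.05, 1.0)
        if abs(simulate_completion_time(theta) - sigma_obs) <= eps:
            accepted.append(theta)
    return accepted

posterior = abc_posterior(sigma_obs)
```

Because the whole posterior sample is retained rather than a point estimate, one can inspect its spread — the parameter-uncertainty assessment the abstract emphasizes as important when only summary data are available.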
Pages: 1517–1535
Number of pages: 18
Related papers
50 records
  • [41] Neural inverse reinforcement learning in autonomous navigation
    Xia, Chen
    El Kamel, Abdelkader
    ROBOTICS AND AUTONOMOUS SYSTEMS, 2016, 84 : 1 - 14
  • [42] Off-Dynamics Inverse Reinforcement Learning
    Kang, Yachen
    Liu, Jinxin
    Wang, Donglin
    IEEE ACCESS, 2024, 12 : 65117 - 65127
  • [43] Inverse Reinforcement Learning in Partially Observable Environments
    Choi, Jaedeug
    Kim, Kee-Eung
    JOURNAL OF MACHINE LEARNING RESEARCH, 2011, 12 : 691 - 730
  • [44] Score-based Inverse Reinforcement Learning
    El Asri, Layla
    Piot, Bilal
    Geist, Matthieu
    Laroche, Romain
    Pietquin, Olivier
    AAMAS'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS, 2016, : 457 - 465
  • [45] Active Advice Seeking for Inverse Reinforcement Learning
    Odom, Phillip
    Natarajan, Sriraam
    AAMAS'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS, 2016, : 503 - 511
  • [46] Learning a Generic Olfactory Search Strategy From Silk Moths by Deep Inverse Reinforcement Learning
    Hernandez-Reyes, Cesar
    Shigaki, Shunsuke
    Yamada, Mayu
    Kondo, Takeshi
    Kurabayashi, Daisuke
    IEEE TRANSACTIONS ON MEDICAL ROBOTICS AND BIONICS, 2022, 4 (01): : 241 - 253
  • [47] Learning strategies in table tennis using inverse reinforcement learning
    Muelling, Katharina
    Boularias, Abdeslam
    Mohler, Betty
    Schoelkopf, Bernhard
    Peters, Jan
    BIOLOGICAL CYBERNETICS, 2014, 108 (05) : 603 - 619
  • [48] Inverse Reinforcement Learning Based Stochastic Driver Behavior Learning
    Ozkan, Mehmet F.
    Rocque, Abishek J.
    Ma, Yao
    IFAC PAPERSONLINE, 2021, 54 (20): : 882 - 888
  • [49] Learning Aircraft Pilot Skills by Adversarial Inverse Reinforcement Learning
    Suzuki, Kaito
    Uemura, Tsuneharu
    Tsuchiya, Takeshi
    Beppu, Hirofumi
    Hazui, Yusuke
    Ono, Hitoi
    2023 ASIA-PACIFIC INTERNATIONAL SYMPOSIUM ON AEROSPACE TECHNOLOGY, VOL I, APISAT 2023, 2024, 1050 : 1431 - 1441
  • [50] Methodologies for Imitation Learning via Inverse Reinforcement Learning: A Review
    Zhang K.
    Yu Y.
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2019, 56 (02): : 254 - 261