Bayesian inverse reinforcement learning for demonstrations of an expert in multiple dynamics: Toward estimation of transferable reward

Authors
Yusuke N. [1]
Sachiyo A. [2]
Affiliations
[1] Department of Urban Environment Systems, Graduate School of Science and Engineering, Chiba University
[2] Department of Urban Environment Systems, Graduate School of Engineering, Chiba University
Keywords
Bayesian inference; Inverse reinforcement learning; Markov decision processes; Reinforcement learning
DOI
10.1527/tjsai.G-J73
Abstract
Although the reinforcement learning framework has numerous achievements, it requires careful shaping of a reward function that represents the objective of a task. There is a class of tasks for which an expert can demonstrate the optimal behavior, but for which it is difficult to design a proper reward function. For these tasks, an inverse reinforcement learning approach seems useful because it makes it possible to estimate a reward function from the expert's demonstrations. Most existing inverse reinforcement learning algorithms assume that an expert gives demonstrations in a single environment. However, an expert could also provide demonstrations of tasks within other environments that have a specific objective function. For example, although it is hard to represent the objective of a driving task explicitly, a driver can give demonstrations in multiple situations. In such cases, it is natural to use these demonstrations from multiple environments to estimate the expert's reward function. We formulate this problem as a Bayesian inverse reinforcement learning problem and propose a Markov chain Monte Carlo method for it. Experimental results show that the proposed method quantitatively outperforms existing methods. © 2020, Japanese Society for Artificial Intelligence. All rights reserved.