Transfer in Inverse Reinforcement Learning for Multiple Strategies

Cited by: 0
Authors
Tanwani, Ajay Kumar [1 ]
Billard, Aude [1 ]
Affiliation
[1] Ecole Polytech Fed Lausanne, LASA, CH-1015 Lausanne, Switzerland
Source
2013 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) | 2013
Keywords
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We consider the problem of incrementally learning different strategies for performing a complex sequential task from multiple demonstrations by an expert or a set of experts. While the task is the same, each expert differs in his/her way of performing it. We assume that this variability across the experts' demonstrations arises because each expert/strategy is driven by a different reward function, where each reward function is expressed as a linear combination of a set of known features. Consequently, we can learn all the expert strategies by forming a convex set of optimal deterministic policies, from which one can match any unseen expert strategy drawn from this set. Instead of learning every optimal policy in this set from scratch, the learner transfers knowledge from the set of already-learned policies to bootstrap its search for a new optimal policy. We demonstrate our approach on a simulated mini-golf task in which a 7-degree-of-freedom Barrett WAM robot arm learns to sequentially putt on different holes in accordance with the playing strategies of the expert.
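The abstract's two core ideas can be sketched in code: a reward that is linear in known features, and matching an unseen expert against a convex combination of already-learned strategies (here via their feature expectations). This is a minimal illustrative sketch, not the paper's implementation; the feature dimensions, the learning rate, and the projected-gradient matching routine are assumptions introduced for this example.

```python
import numpy as np

def linear_reward(w, phi):
    """Reward of a state whose feature vector is phi, under weights w.
    Each expert strategy k is assumed to optimize R_k(s) = w_k . phi(s)."""
    return float(np.dot(w, phi))

def match_unseen_strategy(mu_new, mu_learned, lr=0.01, iters=500):
    """Express an unseen expert's feature expectations mu_new as a convex
    combination of the feature expectations mu_learned (one row per learned
    policy), via least squares with projection onto the simplex."""
    K = mu_learned.shape[0]
    alpha = np.full(K, 1.0 / K)               # start from a uniform mixture
    for _ in range(iters):                    # projected gradient descent
        residual = mu_learned.T @ alpha - mu_new
        alpha -= lr * (mu_learned @ residual)
        alpha = np.clip(alpha, 0.0, None)     # keep weights non-negative
        alpha /= alpha.sum()                  # renormalize onto the simplex
    return alpha

# Two learned strategies with orthogonal feature expectations; the "unseen"
# expert is a 70/30 blend of them, so matching should recover those weights.
mu_learned = np.array([[1.0, 0.0],
                       [0.0, 1.0]])
mu_new = 0.7 * mu_learned[0] + 0.3 * mu_learned[1]
alpha = match_unseen_strategy(mu_new, mu_learned)
```

The recovered mixture weights `alpha` land close to `[0.7, 0.3]`; in the paper's setting such a match lets the learner bootstrap from nearby learned policies rather than solving the new reward from scratch.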
Pages: 3244 - 3250
Page count: 7
Related papers
50 results
  • [1] Learning strategies in table tennis using inverse reinforcement learning
    Muelling, Katharina
    Boularias, Abdeslam
    Mohler, Betty
    Schoelkopf, Bernhard
    Peters, Jan
    BIOLOGICAL CYBERNETICS, 2014, 108 (05) : 603 - 619
  • [2] Identification of animal behavioral strategies by inverse reinforcement learning
    Yamaguchi, Shoichiro
    Naoki, Honda
    Ikeda, Muneki
    Tsukada, Yuki
    Nakano, Shunji
    Mori, Ikue
    Ishii, Shin
    PLOS COMPUTATIONAL BIOLOGY, 2018, 14 (05)
  • [3] Identifiability and Generalizability from Multiple Experts in Inverse Reinforcement Learning
    Rolland, Paul
    Viano, Luca
    Schurhoff, Norman
    Nikolov, Boris
    Cevher, Volkan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022
  • [4] Contextual Action with Multiple Policies Inverse Reinforcement Learning for Behavior Simulation
    Alvarez, Nahum
    Noda, Itsuki
    PROCEEDINGS OF THE 11TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE (ICAART), VOL 2, 2019, : 887 - 894
  • [5] Adversarial Inverse Reinforcement Learning to Estimate Policies from Multiple Experts
    Yamashita, Kodai
    Hamagami, T.
    Transactions of the Institute of Electrical Engineers of Japan, 2021, 141 : 1405 - 1410
  • [6] Strategies for simulating pedestrian navigation with multiple reinforcement learning agents
    Martinez-Gil, Francisco
    Lozano, Miguel
    Fernandez, Fernando
    AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2015, 29 (01) : 98 - 130
  • [7] Hierarchical reinforcement learning of multiple grasping strategies with human instructions
    Osa, T.
    Peters, Jan
    Neumann, G.
    ADVANCED ROBOTICS, 2018, 32 (18) : 955 - 968
  • [8] Inverse Reinforcement Learning of Behavioral Models for Online-Adapting Navigation Strategies
    Herman, Michael
    Fischer, Volker
    Gindele, Tobias
    Burgard, Wolfram
    2015 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2015, : 3215 - 3222