A Review of Inverse Reinforcement Learning Theory and Recent Advances

Citations: 0
Authors
Shao Zhifei [1 ]
Joo, Er Meng [1 ]
Affiliations
[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore, Singapore
Source
2012 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC) | 2012
Keywords
Reinforcement learning; inverse reinforcement learning; reward function; expert demonstration; robot
DOI
Not available
CLC Classification
TM [Electrical Engineering]; TN [Electronic and Communication Technology]
Discipline Codes
0808; 0809
Abstract
A major challenge faced by the machine learning community is decision making under uncertainty. Reinforcement Learning (RL) techniques provide a powerful solution to it: an RL agent interacts with a dynamic environment and finds a policy through a reward function, without using target labels as in Supervised Learning (SL). However, a fundamental assumption of existing RL algorithms is that the reward function, the most succinct representation of the designer's intention, is provided beforehand. In practice, the reward function can be very hard to specify and tedious to tune for large and complex problems, which has inspired the development of Inverse Reinforcement Learning (IRL), an extension of RL that tackles this problem directly by learning the reward function from expert demonstrations. IRL introduces a new way of obtaining policies, deriving the expert's intentions rather than imitating the policy itself, which can be redundant and generalize poorly. In this paper, the original IRL algorithms, their close variants, and their recent advances are reviewed and compared.
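The abstract describes the IRL setting only at a high level. As an illustration of what "learning the reward function from expert demonstrations" can look like in practice, here is a minimal sketch of maximum-entropy IRL (in the style of Ziebart et al., 2008, one of the variants this survey covers), not the paper's own method; the toy chain MDP, one-hot state features, and all parameter values are assumptions chosen for clarity.

```python
# Minimal maximum-entropy IRL sketch on a toy 5-state chain MDP.
# Everything here (environment, features, step sizes) is illustrative.
# The reward is linear in one-hot state features, so theta holds one
# reward weight per state.
import numpy as np

n_states, n_actions, horizon = 5, 2, 10   # actions: 0 = left, 1 = right
gamma = 0.9

# Deterministic chain transitions: P[s, a] is the next state.
P = np.zeros((n_states, n_actions), dtype=int)
for s in range(n_states):
    P[s, 0] = max(s - 1, 0)
    P[s, 1] = min(s + 1, n_states - 1)

# Expert demonstrations: trajectories that walk right toward state 4.
expert_trajs = [[0, 1, 2, 3, 4, 4, 4, 4, 4, 4]] * 5

def soft_value_iteration(theta):
    """Soft-optimal stochastic policy pi[s, a] under reward theta."""
    V = np.zeros(n_states)
    for _ in range(100):
        Q = theta[:, None] + gamma * V[P]        # Q[s, a]
        Qmax = Q.max(axis=1, keepdims=True)      # stable log-sum-exp
        V = (Qmax + np.log(np.exp(Q - Qmax).sum(axis=1, keepdims=True))).ravel()
    return np.exp(Q - V[:, None])

def expected_visitations(pi):
    """Expected state-visitation counts over the horizon, starting at s=0."""
    d = np.zeros(n_states); d[0] = 1.0
    total = d.copy()
    for _ in range(horizon - 1):
        d_next = np.zeros(n_states)
        for s in range(n_states):
            for a in range(n_actions):
                d_next[P[s, a]] += d[s] * pi[s, a]
        d, total = d_next, total + d_next
    return total

# Empirical expert state-visitation counts (averaged per trajectory).
mu_expert = np.zeros(n_states)
for traj in expert_trajs:
    for s in traj:
        mu_expert[s] += 1.0
mu_expert /= len(expert_trajs)

# Gradient ascent: the max-ent gradient is simply the gap between the
# expert's and the current model's state-visitation counts.
theta = np.zeros(n_states)
for _ in range(200):
    pi = soft_value_iteration(theta)
    theta += 0.1 * (mu_expert - expected_visitations(pi))

print("recovered reward weights:", np.round(theta, 2))
```

On this toy problem the recovered weights assign the largest reward to state 4, the state the expert drives toward, which captures the core IRL idea the abstract states: the demonstrations reveal the intention, and the reward, not the policy, is what gets learned.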
Pages: 8
Related Papers
50 items in total
  • [21] Score-based Inverse Reinforcement Learning
    El Asri, Layla
    Piot, Bilal
    Geist, Matthieu
    Laroche, Romain
    Pietquin, Olivier
    AAMAS'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS, 2016, : 457 - 465
  • [22] Sensitivity-Based Inverse Reinforcement Learning
    Tao, Zhaorong
    Chen, Zhichao
    Li, Yanjie
    2013 32ND CHINESE CONTROL CONFERENCE (CCC), 2013, : 2856 - 2861
  • [23] Online Inverse Reinforcement Learning Under Occlusion
    Arora, Saurabh
    Doshi, Prashant
    Banerjee, Bikramjit
    AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 1170 - 1178
  • [24] Option compatible reward inverse reinforcement learning
    Hwang, Rakhoon
    Lee, Hanjin
    Hwang, Hyung Ju
    PATTERN RECOGNITION LETTERS, 2022, 154 : 83 - 89
  • [25] Neural inverse reinforcement learning in autonomous navigation
    Xia, Chen
    El Kamel, Abdelkader
    ROBOTICS AND AUTONOMOUS SYSTEMS, 2016, 84 : 1 - 14
  • [26] Inverse Reinforcement Learning based on Critical State
    Hwang, Kao-Shing
    Cheng, Tien-Yu
    Jiang, Wei-Cheng
    PROCEEDINGS OF THE 2015 CONFERENCE OF THE INTERNATIONAL FUZZY SYSTEMS ASSOCIATION AND THE EUROPEAN SOCIETY FOR FUZZY LOGIC AND TECHNOLOGY, 2015, 89 : 771 - 775
  • [27] Off-Dynamics Inverse Reinforcement Learning
    Kang, Yachen
    Liu, Jinxin
    Wang, Donglin
    IEEE ACCESS, 2024, 12 : 65117 - 65127
  • [28] Decentralized multi-agent reinforcement learning with networked agents: recent advances
    Zhang, Kaiqing
    Yang, Zhuoran
    Basar, Tamer
    FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2021, 22 (06) : 802 - 814
  • [29] Adaboost-like Method for Inverse Reinforcement Learning
    Hwang, Kao-Shing
    Chiang, Hsuan-yi
    Jiang, Wei-Cheng
    2016 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2016, : 1922 - 1925
  • [30] An Unified Approach to Inverse Reinforcement Learning by Oppositive Demonstrations
    Hwang, Kao-Shing
    Jiang, Wei-Cheng
    Tseng, Yi-Chia
    PROCEEDINGS 2016 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL TECHNOLOGY (ICIT), 2016, : 1664 - 1668