Hierarchical Adversarial Inverse Reinforcement Learning

Cited by: 3
Authors
Chen, Jiayu [1]
Lan, Tian [2]
Aggarwal, Vaneet [1,3]
Affiliations
[1] Purdue Univ, Sch Ind Engn, W Lafayette, IN 47907 USA
[2] George Washington Univ, Dept Elect & Comp Engn, Washington, DC 20052 USA
[3] KAUST, Comp Sci Dept, Thuwal 23955, Saudi Arabia
Keywords
Inverse reinforcement learning (IRL); hierarchical imitation learning (HIL); robotic learning
DOI
10.1109/TNNLS.2023.3305983
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Imitation learning (IL) has been proposed to recover the expert policy from demonstrations. However, it is difficult to learn a single monolithic policy for highly complex long-horizon tasks, for which the expert policy usually contains subtask hierarchies. Therefore, hierarchical IL (HIL) has been developed to learn a hierarchical policy from expert demonstrations by explicitly modeling the subtask structure of a task with the option framework. Existing HIL methods either overlook the causal relationship between the subtask structure and the learned policy, or fail to learn the high-level and low-level policies of the hierarchical framework jointly, which leads to suboptimality. In this work, we propose a novel HIL algorithm, hierarchical adversarial inverse reinforcement learning (H-AIRL), which extends a state-of-the-art (SOTA) IL algorithm, AIRL, with the one-step option framework. Specifically, we redefine the AIRL objectives on the extended state and action spaces, and further introduce a directed information term into the objective function to strengthen the causal link between each low-level policy and its corresponding subtask. Moreover, we propose an expectation-maximization (EM) adaptation of our algorithm so that it can be applied to expert demonstrations without subtask annotations, which are more accessible in practice. Theoretical justifications of our algorithm design and evaluations on challenging robotic control tasks are provided to show the superiority of our algorithm over SOTA HIL baselines. The code is available at https://github.com/LucasCJYSDL/HierAIRL.
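For orientation, a minimal sketch of the construction the abstract describes, written in standard AIRL notation (Fu et al., 2018) rather than the paper's own symbols; the factorization and the directed information form below are assumptions based on the one-step option framework, not quoted from the paper. AIRL trains a discriminator

D_\theta(s, a) = \frac{\exp f_\theta(s, a)}{\exp f_\theta(s, a) + \pi(a \mid s)},

so that, at optimality, f_\theta recovers the expert reward. Assuming the extended state \tilde{s}_t = (s_t, o_{t-1}) (the current state paired with the previous option) and the extended action \tilde{a}_t = (o_t, a_t), the hierarchical policy factorizes as

\pi(\tilde{a}_t \mid \tilde{s}_t) = \pi^{H}(o_t \mid s_t, o_{t-1}) \, \pi^{L}(a_t \mid s_t, o_t),

and the AIRL discriminator and objective can then be rewritten on (\tilde{s}, \tilde{a}). The directed information term mentioned in the abstract, schematically I(\tau \rightarrow o_{1:T}), rewards option sequences that are predictable from the generated trajectory, tying each low-level policy to its subtask; the exact objectives and the EM treatment of unannotated demonstrations are given in the paper.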
Pages: 17549-17558
Number of pages: 10
Related Papers
50 records in total
  • [1] Multi-task Hierarchical Adversarial Inverse Reinforcement Learning
    Chen, Jiayu
    Tamboli, Dipesh
    Lan, Tian
    Aggarwal, Vaneet
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 202, 2023
  • [2] Multiagent Adversarial Inverse Reinforcement Learning
    Wei, Ermo
    Wicke, Drew
    Luke, Sean
    AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019: 2265-2266
  • [3] Hierarchical Bayesian Inverse Reinforcement Learning
    Choi, Jaedeug
    Kim, Kee-Eung
    IEEE TRANSACTIONS ON CYBERNETICS, 2015, 45(4): 793-805
  • [4] Inverse Reinforcement Learning for Adversarial Apprentice Games
    Lian, Bosen
    Xue, Wenqian
    Lewis, Frank L.
    Chai, Tianyou
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34(8): 4596-4609
  • [5] Learning Aircraft Pilot Skills by Adversarial Inverse Reinforcement Learning
    Suzuki, Kaito
    Uemura, Tsuneharu
    Tsuchiya, Takeshi
    Beppu, Hirofumi
    Hazui, Yusuke
    Ono, Hitoi
    2023 ASIA-PACIFIC INTERNATIONAL SYMPOSIUM ON AEROSPACE TECHNOLOGY, VOL I, APISAT 2023, 2024, 1050: 1431-1441
  • [6] Multi-Agent Adversarial Inverse Reinforcement Learning
    Yu, Lantao
    Song, Jiaming
    Ermon, Stefano
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019
  • [7] Inverse-Inverse Reinforcement Learning. How to Hide Strategy from an Adversarial Inverse Reinforcement Learner
    Pattanayak, Kunal
    Krishnamurthy, Vikram
    Berry, Christopher
    2022 IEEE 61ST CONFERENCE ON DECISION AND CONTROL (CDC), 2022: 3631-3636
  • [8] Adaptive generative adversarial maximum entropy inverse reinforcement learning
    Song, Li
    Li, Dazi
    Xu, Xin
    INFORMATION SCIENCES, 2025, 695
  • [9] Online inverse reinforcement learning for nonlinear systems with adversarial attacks
    Lian, Bosen
    Xue, Wenqian
    Lewis, Frank L.
    Chai, Tianyou
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2021, 31(14): 6646-6667
  • [10] Modeling Driver Behavior using Adversarial Inverse Reinforcement Learning
    Sackmann, Moritz
    Bey, Henrik
    Hofmann, Ulrich
    Thielecke, Joern
    2022 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV), 2022: 1683-1690