Adversarial Option-Aware Hierarchical Imitation Learning

Authors
Jing, Mingxuan [1]
Huang, Wenbing [1]
Sun, Fuchun [1,2]
Ma, Xiaojian [3]
Kong, Tao [4]
Gan, Chuang [5]
Li, Lei [4]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Beijing, Peoples R China
[2] THU Bosch JCML Ctr, Beijing, Peoples R China
[3] Univ Calif Los Angeles, Los Angeles, CA USA
[4] Bytedance AI Lab, Beijing, Peoples R China
[5] MIT IBM Watson AI Lab, Cambridge, MA USA
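Source
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139 | 2021 / Vol. 139
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation; National Key Research and Development Program of China;
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract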
Learning skills for an agent from long-horizon, unannotated demonstrations has long been a challenge. Existing approaches such as Hierarchical Imitation Learning (HIL) are prone to compounding errors or suboptimal solutions. In this paper, we propose Option-GAIL, a novel method for learning skills over long horizons. The key idea of Option-GAIL is to model the task hierarchy with options and to train the policy via generative adversarial optimization. In particular, we propose an Expectation-Maximization (EM)-style algorithm: an E-step that samples the expert's options conditioned on the currently learned policy, and an M-step that updates the agent's low- and high-level policies simultaneously to minimize the newly proposed option-occupancy measurement between the expert and the agent. We theoretically prove the convergence of the proposed algorithm. Experiments show that Option-GAIL consistently outperforms its counterparts across a variety of tasks.
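The E-step described in the abstract can be illustrated with a small tabular sketch. This is not the authors' implementation. It assumes discrete states, actions, and options, and it replaces sampling with a Viterbi-style most-likely decoding, a choice the abstract does not specify. The function name infer_options, the table layouts pi_high[o_prev][s][o] and pi_low[o][s][a], and the toy numbers are all hypothetical.

```python
import math

def infer_options(states, actions, pi_high, pi_low, init_option=0):
    """Most likely option sequence for one unannotated demonstration.

    Assumed (hypothetical) table layouts:
      pi_high[o_prev][s][o]: prob. of picking option o in state s after o_prev
      pi_low[o][s][a]:       prob. of action a in state s under option o
    """
    n_opt = len(pi_low)
    eps = 1e-12  # avoids log(0) for zero-probability entries

    def log_high(o_prev, s, o):
        return math.log(pi_high[o_prev][s][o] + eps)

    def log_low(o, s, a):
        return math.log(pi_low[o][s][a] + eps)

    # Scores at the first step, starting from the assumed initial option.
    s0, a0 = states[0], actions[0]
    score = [log_high(init_option, s0, o) + log_low(o, s0, a0)
             for o in range(n_opt)]
    back = []  # back[i][o] = best previous option for option o at step i + 1

    for s, a in zip(states[1:], actions[1:]):
        new_score, pointers = [], []
        for o in range(n_opt):
            best_prev = max(range(n_opt),
                            key=lambda p: score[p] + log_high(p, s, o))
            pointers.append(best_prev)
            new_score.append(score[best_prev]
                             + log_high(best_prev, s, o)
                             + log_low(o, s, a))
        score = new_score
        back.append(pointers)

    # Backtrack from the best final option.
    option = max(range(n_opt), key=lambda o: score[o])
    path = [option]
    for pointers in reversed(back):
        option = pointers[option]
        path.append(option)
    return path[::-1]

# Toy example: one state, two options, two actions.
# Options are "sticky", and option k strongly prefers action k.
pi_high = [[[0.9, 0.1]], [[0.1, 0.9]]]
pi_low = [[[0.9, 0.1]], [[0.1, 0.9]]]
demo_states = [0, 0, 0, 0, 0, 0]
demo_actions = [0, 0, 0, 1, 1, 1]
inferred = infer_options(demo_states, demo_actions, pi_high, pi_low)
print(inferred)
```

In an EM-style loop, the inferred options would annotate the expert data before the M-step updates both policy levels. The adversarial M-step itself is not sketched here.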
Pages: 10