Adversarial Option-Aware Hierarchical Imitation Learning

Cited by: 0

Authors:

Jing, Mingxuan [1]

Huang, Wenbing [1]

Sun, Fuchun [1,2]

Ma, Xiaojian [3]

Kong, Tao [4]

Gan, Chuang [5]

Li, Lei [4]

Affiliations:

[1] Tsinghua Univ, Dept Comp Sci & Technol, Beijing, Peoples R China

[2] THU Bosch JCML Ctr, Beijing, Peoples R China

[3] Univ Calif Los Angeles, Los Angeles, CA USA

[4] Bytedance AI Lab, Beijing, Peoples R China

[5] MIT IBM Watson AI Lab, Cambridge, MA USA

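Source:

INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139 | 2021 / Vol. 139

Funding:

National Natural Science Foundation of China; China Postdoctoral Science Foundation; National Key Research and Development Program of China;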

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Learning skills for an agent from long-horizon unannotated demonstrations has been a challenge. Existing approaches such as Hierarchical Imitation Learning (HIL) are prone to compounding errors or suboptimal solutions. In this paper, we propose Option-GAIL, a novel method to learn skills at long horizon. The key idea of Option-GAIL is to model the task hierarchy with options and to train the policy via generative adversarial optimization. In particular, we propose an Expectation-Maximization (EM)-style algorithm: an E-step that samples the options of the expert conditioned on the currently learned policy, and an M-step that updates the low- and high-level policies of the agent simultaneously to minimize the newly proposed option-occupancy measurement between the expert and the agent. We theoretically prove the convergence of the proposed algorithm. Experiments show that Option-GAIL consistently outperforms its counterparts across a variety of tasks.
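
As a rough illustration of the E-step idea in the abstract (recovering the expert's unobserved options under the current policy), the sketch below decodes the most likely option sequence for one demonstration. It is not the authors' implementation. It assumes small discrete state, action, and option spaces with tabular policies, and it uses Viterbi decoding (a single most likely sequence) in place of sampling. The names `infer_options`, `pi_high`, `pi_low`, and `init_option` are hypothetical. The adversarial M-step is not shown.

```python
import numpy as np

def infer_options(states, actions, pi_high, pi_low, init_option=0):
    """Most likely option sequence for one demonstration (Viterbi decoding).

    Assumed (hypothetical) tabular layout:
      pi_high[s, o_prev, o] : P(option o | state s, previous option o_prev)
      pi_low[s, o, a]       : P(action a | state s, option o)
    """
    eps = 1e-12  # avoids log(0) for zero-probability entries
    T = len(states)
    K = pi_low.shape[1]
    log_delta = np.full((T, K), -np.inf)   # best log-probability ending in each option
    backptr = np.zeros((T, K), dtype=int)  # best previous option for each current option

    s0, a0 = states[0], actions[0]
    log_delta[0] = np.log(pi_high[s0, init_option] + eps) + np.log(pi_low[s0, :, a0] + eps)

    for t in range(1, T):
        s, a = states[t], actions[t]
        # scores[o_prev, o]: best path ending in o_prev, then choosing option o
        scores = log_delta[t - 1][:, None] + np.log(pi_high[s] + eps)
        backptr[t] = scores.argmax(axis=0)
        log_delta[t] = scores.max(axis=0) + np.log(pi_low[s, :, a] + eps)

    # Backtrack from the best final option
    options = np.zeros(T, dtype=int)
    options[-1] = log_delta[-1].argmax()
    for t in range(T - 1, 0, -1):
        options[t - 1] = backptr[t, options[t]]
    return options
```

In a full EM-style loop, the inferred options would be attached to the expert's state-action pairs before the policy update, and inference would be repeated after each update of the policies.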


Pages: 10

