Option-Aware Adversarial Inverse Reinforcement Learning for Robotic Control

Cited by: 2
Authors
Chen, Jiayu [1 ]
Lan, Tian [3 ]
Aggarwal, Vaneet [1 ,2 ]
Affiliations
[1] Purdue Univ, Sch Ind Engn, W Lafayette, IN 47907 USA
[2] KAUST, CS Dept, Thuwal, Saudi Arabia
[3] George Washington Univ, Dept Elect & Comp Engn, Washington, DC 20052 USA
Keywords
NEURAL-NETWORKS;
DOI
10.1109/ICRA48891.2023.10160374
CLC number
TP [Automation Technology; Computer Technology]
Discipline code
0812
Abstract
Hierarchical Imitation Learning (HIL) has been proposed to recover highly complex behaviors in long-horizon tasks from expert demonstrations by modeling the task hierarchy with the option framework. Existing methods either overlook the causal relationship between a subtask and its corresponding policy or cannot learn the policy in an end-to-end fashion, which leads to suboptimality. In this work, we develop a novel HIL algorithm based on Adversarial Inverse Reinforcement Learning and adapt it with the Expectation-Maximization algorithm in order to directly recover a hierarchical policy from unannotated demonstrations. Further, we introduce a directed information term into the objective function to enhance causality, and we propose a Variational Autoencoder framework for learning with our objectives in an end-to-end fashion. Theoretical justifications and evaluations on challenging robotic control tasks are provided to show the superiority of our algorithm. The code is available at https://github.com/LucasCJYSDL/HierAIRL.
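The abstract describes recovering a hierarchical (option-conditioned) policy from unannotated demonstrations via Expectation-Maximization, where the per-step options are latent variables. The sketch below illustrates only the generic E-step idea for such a setup on a tabular toy problem: a high-level policy picks the current option given the state and previous option, a low-level policy picks the action given the state and option, and a forward-backward pass infers option posteriors along a demonstration. All names and the tabular setting are illustrative assumptions, not the paper's actual (neural, adversarial) implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_options, n_actions, T = 4, 2, 3, 6

# Hypothetical tabular hierarchical policy (illustrative, not the paper's networks):
# pi_hi[s, o_prev, o]: high-level policy, P(option o | state s, previous option o_prev)
# pi_lo[s, o, a]:      low-level policy,  P(action a | state s, option o)
pi_hi = rng.dirichlet(np.ones(n_options), size=(n_states, n_options))
pi_lo = rng.dirichlet(np.ones(n_actions), size=(n_states, n_options))

# An unannotated demonstration: states and actions are observed, options are latent.
states = rng.integers(0, n_states, size=T)
actions = rng.integers(0, n_actions, size=T)

def option_posteriors(states, actions, pi_hi, pi_lo):
    """E-step: forward-backward over the latent option chain of one trajectory."""
    T = len(states)
    alpha = np.zeros((T, n_options))  # forward messages
    beta = np.ones((T, n_options))    # backward messages
    # First step: assume a uniform prior over the (nonexistent) previous option.
    init = pi_hi[states[0]].mean(axis=0) * pi_lo[states[0], :, actions[0]]
    alpha[0] = init / init.sum()
    for t in range(1, T):
        trans = pi_hi[states[t]]                 # [o_prev, o] transition at time t
        emit = pi_lo[states[t], :, actions[t]]   # [o] likelihood of observed action
        a = (alpha[t - 1] @ trans) * emit
        alpha[t] = a / a.sum()                   # normalize for numerical stability
    for t in range(T - 2, -1, -1):
        trans = pi_hi[states[t + 1]]
        emit = pi_lo[states[t + 1], :, actions[t + 1]]
        b = trans @ (emit * beta[t + 1])
        beta[t] = b / b.sum()
    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)  # P(o_t | full trajectory)

posterior = option_posteriors(states, actions, pi_hi, pi_lo)
print(posterior.shape)  # one distribution over options per time step
```

An M-step would then reweight the policy updates by these posteriors; in the paper this inference and the policy/discriminator learning are instead carried out end-to-end with a VAE-style objective.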
Pages: 5902-5908 (7 pages)