State Aware Imitation Learning

Cited: 0
Authors
Schroecker, Yannick [1 ]
Isbell, Charles [1 ]
Affiliations
[1] Georgia Inst Technol, Coll Comp, Atlanta, GA 30332 USA
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017) | 2017 / Vol. 30
Keywords
AVERAGE;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Imitation learning is the study of learning how to act given a set of demonstrations provided by a human expert. It is intuitively apparent that learning to take optimal actions is a simpler undertaking in situations that are similar to the ones shown by the teacher. However, imitation learning approaches tend not to use this insight directly. In this paper, we introduce State Aware Imitation Learning (SAIL), an imitation learning algorithm that allows an agent to learn how to remain in states where it can confidently take the correct action and how to recover if it is led astray. Key to this algorithm is a gradient, learned using a temporal difference update rule, that leads the agent to prefer states similar to the demonstrated states. We show that estimating a linear approximation of this gradient yields theoretical guarantees similar to those of online temporal difference learning approaches, and we show empirically that SAIL can be used effectively for imitation learning in continuous domains with non-linear function approximators for both the policy representation and the gradient estimate.
Pages: 10
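The abstract above outlines the core mechanism at a high level: a temporal-difference-style update learns an estimate of how the policy parameters influence which states the agent visits, and this estimate is combined with an ordinary supervised imitation gradient on the demonstrated state-action pairs. The sketch below is a minimal, illustrative rendering of that idea only; the function names, the linear estimator psi, the features and grad_log_pi callables, and the step sizes alpha, beta, lam are assumptions made here for exposition and are not taken from the paper.

```python
# Illustrative sketch only: a SAIL-style update combining a behavioral-cloning
# gradient with a TD-learned estimate of the gradient of the log state
# distribution. All names and hyperparameters below are assumptions for
# exposition, not the authors' implementation.
import numpy as np

def sail_update(theta, psi, demo_states, demo_actions, rollout,
                grad_log_pi, features, alpha=1e-3, beta=1e-3, lam=0.1):
    """One illustrative parameter update.

    theta       : policy parameters (1-D array)
    psi         : weights of a linear estimator; psi @ features(s) approximates
                  the gradient (w.r.t. theta) of the log state distribution
    rollout     : iterable of (s, a, s_next) transitions sampled from the policy
    grad_log_pi : callable (theta, s, a) -> gradient of log pi_theta(a | s)
    features    : callable s -> feature vector of state s
    """
    # Temporal-difference-style update: for a transition (s, a, s_next),
    # move the estimate at s_next toward the estimate at s plus the policy
    # score for the action that led there.
    for s, a, s_next in rollout:
        target = psi @ features(s) + grad_log_pi(theta, s, a)
        td_error = target - psi @ features(s_next)
        psi = psi + beta * np.outer(td_error, features(s_next))

    # Imitation update on the demonstrations: the usual supervised
    # (behavioral-cloning) score plus the learned state term, nudging the
    # policy toward states that resemble the demonstrated ones.
    grad = np.zeros_like(theta)
    for s, a in zip(demo_states, demo_actions):
        grad = grad + grad_log_pi(theta, s, a) + lam * (psi @ features(s))
    theta = theta + alpha * grad / len(demo_states)
    return theta, psi
```

In this reading, the TD loop propagates the policy score along sampled transitions so that psi @ features(s) tracks how parameter changes shift the visitation of state s, and the final update pushes the policy both to match the demonstrated actions and to stay near the demonstrated states.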