State Aware Imitation Learning

Citations: 0
Authors
Schroecker, Yannick [1]
Isbell, Charles [1]
Affiliations
[1] Georgia Inst Technol, Coll Comp, Atlanta, GA 30332 USA
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017) | 2017 / Vol. 30
Keywords
AVERAGE;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Imitation learning is the study of learning how to act given a set of demonstrations provided by a human expert. It is intuitively apparent that learning to take optimal actions is a simpler undertaking in situations that are similar to the ones shown by the teacher. However, imitation learning approaches do not tend to use this insight directly. In this paper, we introduce State Aware Imitation Learning (SAIL), an imitation learning algorithm that allows an agent to learn how to remain in states where it can confidently take the correct action and how to recover if it is led astray. Key to this algorithm is a gradient learned using a temporal difference update rule which leads the agent to prefer states similar to the demonstrated states. We show that estimating a linear approximation of this gradient yields similar theoretical guarantees to online temporal difference learning approaches and empirically show that SAIL can effectively be used for imitation learning in continuous domains with non-linear function approximators used for both the policy representation and the gradient estimate.
Pages: 10