State Aware Imitation Learning

Cited by: 0

Authors:

Schroecker, Yannick [1]

Isbell, Charles [1]

Affiliations:

[1] Georgia Inst Technol, Coll Comp, Atlanta, GA 30332 USA

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017) | 2017 / Vol. 30

Keywords:

AVERAGE;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Imitation learning is the study of learning how to act given a set of demonstrations provided by a human expert. It is intuitively apparent that learning to take optimal actions is a simpler undertaking in situations that are similar to the ones shown by the teacher. However, imitation learning approaches do not tend to use this insight directly. In this paper, we introduce State Aware Imitation Learning (SAIL), an imitation learning algorithm that allows an agent to learn how to remain in states where it can confidently take the correct action and how to recover if it is led astray. Key to this algorithm is a gradient learned using a temporal difference update rule, which leads the agent to prefer states similar to the demonstrated states. We show that estimating a linear approximation of this gradient yields similar theoretical guarantees to online temporal difference learning approaches and empirically show that SAIL can effectively be used for imitation learning in continuous domains with non-linear function approximators used for both the policy representation and the gradient estimate.

Pages: 10
