State Aware Imitation Learning

Cited: 0
Authors
Schroecker, Yannick [1 ]
Isbell, Charles [1 ]
Affiliations
[1] Georgia Inst Technol, Coll Comp, Atlanta, GA 30332 USA
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017) | 2017 / Vol. 30
Keywords
AVERAGE;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Imitation learning is the study of learning how to act given a set of demonstrations provided by a human expert. It is intuitively apparent that learning to take optimal actions is a simpler undertaking in situations that are similar to the ones shown by the teacher. However, imitation learning approaches tend not to use this insight directly. In this paper, we introduce State Aware Imitation Learning (SAIL), an imitation learning algorithm that allows an agent to learn how to remain in states where it can confidently take the correct action and how to recover if it is led astray. Key to this algorithm is a gradient, learned using a temporal difference update rule, that leads the agent to prefer states similar to the demonstrated states. We show that estimating a linear approximation of this gradient yields theoretical guarantees similar to those of online temporal difference learning approaches, and we show empirically that SAIL can be used effectively for imitation learning in continuous domains with non-linear function approximators for both the policy representation and the gradient estimate.
Pages: 10
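The abstract above outlines the core mechanism at a high level: a temporal-difference-style update learns an estimate of how the policy parameters influence which states the agent visits, and this estimate is combined with an ordinary supervised imitation gradient on the demonstrated state-action pairs. The sketch below is a minimal, illustrative rendering of that idea only; the function names, the linear estimator psi, the features and grad_log_pi callables, and the step sizes alpha, beta, lam are assumptions made here for exposition and are not taken from the paper.

```python
# Illustrative sketch only: a SAIL-style update combining a behavioral-cloning
# gradient with a TD-learned estimate of the gradient of the log state
# distribution. All names and hyperparameters below are assumptions for
# exposition, not the authors' implementation.
import numpy as np

def sail_update(theta, psi, demo_states, demo_actions, rollout,
                grad_log_pi, features, alpha=1e-3, beta=1e-3, lam=0.1):
    """One illustrative parameter update.

    theta       : policy parameters (1-D array)
    psi         : weights of a linear estimator; psi @ features(s) approximates
                  the gradient (w.r.t. theta) of the log state distribution
    rollout     : iterable of (s, a, s_next) transitions sampled from the policy
    grad_log_pi : callable (theta, s, a) -> gradient of log pi_theta(a | s)
    features    : callable s -> feature vector of state s
    """
    # Temporal-difference-style update: for a transition (s, a, s_next),
    # move the estimate at s_next toward the estimate at s plus the policy
    # score for the action that led there.
    for s, a, s_next in rollout:
        target = psi @ features(s) + grad_log_pi(theta, s, a)
        td_error = target - psi @ features(s_next)
        psi = psi + beta * np.outer(td_error, features(s_next))

    # Imitation update on the demonstrations: the usual supervised
    # (behavioral-cloning) score plus the learned state term, nudging the
    # policy toward states that resemble the demonstrated ones.
    grad = np.zeros_like(theta)
    for s, a in zip(demo_states, demo_actions):
        grad = grad + grad_log_pi(theta, s, a) + lam * (psi @ features(s))
    theta = theta + alpha * grad / len(demo_states)
    return theta, psi
```

In this reading, the TD loop propagates the policy score along sampled transitions so that psi @ features(s) tracks how parameter changes shift the visitation of state s, and the final update pushes the policy both to match the demonstrated actions and to stay near the demonstrated states.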