Objective learning from human demonstrations

Cited by: 6
Authors
Lin, Jonathan Feng-Shun [1 ]
Carreno-Medrano, Pamela [2 ]
Parsapour, Mahsa [3 ]
Sakr, Maram [2 ,4 ]
Kulic, Dana [2 ]
Affiliations
[1] Univ Waterloo, Syst Design Engn, Waterloo, ON, Canada
[2] Monash Univ, Fac Engn, Clayton, Vic, Australia
[3] Univ Waterloo, Elect & Comp Engn, Waterloo, ON, Canada
[4] Univ British Columbia, Mech Engn, Vancouver, BC, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
Reward learning; Inverse optimal control; Inverse reinforcement learning; Cost functions; Generation; Robot
DOI
10.1016/j.arcontrol.2021.04.003
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Researchers in biomechanics, neuroscience, human-machine interaction and other fields are interested in inferring human intentions and objectives from observed actions. The problem of inferring objectives from observations has received extensive theoretical and methodological development from both the controls and machine learning communities. In this paper, we provide an integrating view of objective learning from human demonstration data. We differentiate algorithms based on the assumptions made about the objective function structure, how the similarity between the inferred objectives and the observed demonstrations is assessed, the assumptions made about the agent and environment model, and the properties of the observed human demonstrations. We review the application domains and validation approaches of existing works and identify the key open challenges and limitations. The paper concludes with an identification of promising directions for future work.
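The abstract frames the core problem surveyed in the paper: recovering an objective (reward or cost) function that explains observed demonstrations. As a rough illustration only, and not the authors' own method or code, the sketch below shows a maximum-entropy-style reward-weight update over a finite set of candidate trajectories, assuming a reward that is linear in trajectory features; all names (learn_reward_weights, phi_expert, phi_candidates) are hypothetical.

```python
import numpy as np

# Illustrative sketch (hypothetical, not from the surveyed paper):
# reward learning by matching expert feature expectations under a
# Boltzmann distribution over a finite set of candidate trajectories.
# Each trajectory is summarized by a feature vector phi; the reward is
# assumed linear, r(trajectory) = w . phi.

def learn_reward_weights(phi_expert, phi_candidates, lr=0.1, iters=500):
    """phi_expert: (d,) mean feature vector of the expert demonstrations.
    phi_candidates: (n, d) feature vectors of candidate trajectories."""
    w = np.zeros(phi_candidates.shape[1])
    for _ in range(iters):
        # Boltzmann (max-ent) distribution over candidates under current reward
        logits = phi_candidates @ w
        p = np.exp(logits - logits.max())
        p /= p.sum()
        # Gradient of the max-ent log-likelihood: expert minus expected features
        grad = phi_expert - p @ phi_candidates
        w += lr * grad
    return w

# Toy usage: 2-D features, expert behaviour favours the first feature.
rng = np.random.default_rng(0)
phi_candidates = rng.uniform(0.0, 1.0, size=(50, 2))
phi_expert = phi_candidates[phi_candidates[:, 0].argmax()]
print("recovered reward weights:", learn_reward_weights(phi_expert, phi_candidates))
```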
Pages: 111-129
Page count: 19
Related Papers
50 items in total
  • [31] Learning Jumping Skills from Human with a Fast Reinforcement Learning Framework
    Kuang, Yiqun
    Wang, Shanren
    Sun, Bibei
    Hao, Jiasheng
    Cheng, Hong
    2018 IEEE 8TH ANNUAL INTERNATIONAL CONFERENCE ON CYBER TECHNOLOGY IN AUTOMATION, CONTROL, AND INTELLIGENT SYSTEMS (IEEE-CYBER), 2018, : 510 - 515
  • [32] Better-than-Demonstrator Imitation Learning via Automatically-Ranked Demonstrations
    Brown, Daniel S.
    Goo, Wonjoon
    Niekum, Scott
    CONFERENCE ON ROBOT LEARNING, VOL 100, 2019, 100
  • [33] Inverse Optimal Control for the identification of human objective: a preparatory study for physical Human-Robot Interaction
    Franceschi, Paolo
    Pedrocchi, Nicola
    Beschi, Manuel
    2022 IEEE 27TH INTERNATIONAL CONFERENCE ON EMERGING TECHNOLOGIES AND FACTORY AUTOMATION (ETFA), 2022,
  • [34] A Deep Reinforcement Learning Algorithm with Expert Demonstrations and Supervised Loss and its application in Autonomous Driving
    Liu, Kai
    Wan, Qin
    Li, Yanjie
    2018 37TH CHINESE CONTROL CONFERENCE (CCC), 2018, : 2944 - 2949
  • [35] Bayesian inverse reinforcement learning for demonstrations of an expert in multiple dynamics: Toward estimation of transferable reward
    Yusuke N.
    Sachiyo A.
    Transactions of the Japanese Society for Artificial Intelligence, 2020, 35 (01)
  • [36] From inverse optimal control to inverse reinforcement learning: A historical review
    Ab Azar, Nematollah
    Shahmansoorian, Aref
    Davoudi, Mohsen
    ANNUAL REVIEWS IN CONTROL, 2020, 50 : 119 - 138
  • [37] Model-free inverse reinforcement learning with multi-intention, unlabeled, and overlapping demonstrations
    Ariyan Bighashdel
    Pavol Jancura
    Gijs Dubbelman
    Machine Learning, 2023, 112 : 2263 - 2296
  • [38] Model-free inverse reinforcement learning with multi-intention, unlabeled, and overlapping demonstrations
    Bighashdel, Ariyan
    Jancura, Pavol
    Dubbelman, Gijs
    MACHINE LEARNING, 2023, 112 (07) : 2263 - 2296
  • [39] Objective assessment of the human visual attentional state
    Willeford, Kevin T.
    Ciuffreda, Kenneth J.
    Yadav, Naveen K.
    Ludlam, Diana P.
    DOCUMENTA OPHTHALMOLOGICA, 2013, 126 (01) : 29 - 44
  • [40] Interactive Learning from Policy-Dependent Human Feedback
    MacGlashan, James
    Ho, Mark K.
    Loftin, Robert
    Peng, Bei
    Wang, Guan
    Roberts, David L.
    Taylor, Matthew E.
    Littman, Michael L.
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70