Objective learning from human demonstrations

Cited by: 6

Authors:

Lin, Jonathan Feng-Shun [1]

Carreno-Medrano, Pamela [2]

Parsapour, Mahsa [3]

Sakr, Maram [2,4]

Kulic, Dana [2]

Affiliations:

[1] Univ Waterloo, Syst Design Engn, Waterloo, ON, Canada

[2] Monash Univ, Fac Engn, Clayton, Vic, Australia

[3] Univ Waterloo, Elect & Comp Engn, Waterloo, ON, Canada

[4] Univ British Columbia, Mech Engn, Vancouver, BC, Canada

Source:

ANNUAL REVIEWS IN CONTROL | 2021 / Vol. 51

Funding:

Natural Sciences and Engineering Research Council of Canada (NSERC);

Keywords:

Reward learning; Inverse optimal control; Inverse reinforcement learning; INVERSE OPTIMAL-CONTROL; COST-FUNCTIONS; GENERATION; ROBOT;

DOI:

10.1016/j.arcontrol.2021.04.003

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Subject classification code:

0812;

Abstract:

Researchers in biomechanics, neuroscience, human-machine interaction and other fields are interested in inferring human intentions and objectives from observed actions. The problem of inferring objectives from observations has received extensive theoretical and methodological development from both the controls and machine learning communities. In this paper, we provide an integrating view of objective learning from human demonstration data. We differentiate algorithms based on the assumptions made about the objective function structure, how the similarity between the inferred objectives and the observed demonstrations is assessed, the assumptions made about the agent and environment model, and the properties of the observed human demonstrations. We review the application domains and validation approaches of existing works and identify the key open challenges and limitations. The paper concludes with an identification of promising directions for future work.


Pages: 111-129

Number of pages: 19

50 records in total

[21] Combination of learning from non-optimal demonstrations and feedbacks using inverse reinforcement learning and Bayesian policy improvement
Ezzeddine, Ali
Mourad, Nafee
Araabi, Babak Nadjar
Ahmadabadi, Majid Nili
EXPERT SYSTEMS WITH APPLICATIONS, 2018, 112 : 331 - 341
[22] An Unified Approach to Inverse Reinforcement Learning by Oppositive Demonstrations
Hwang, Kao-Shing
Jiang, Wei-Cheng
Tseng, Yi-Chia
PROCEEDINGS 2016 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL TECHNOLOGY (ICIT), 2016, : 1664 - 1668
[23] Label-Free Adaptive Gaussian Sample Consensus Framework for Learning From Perfect and Imperfect Demonstrations
Hu, Yi
Samadikhoshkho, Zahra
Jin, Jun
Tavakoli, Mahdi
IEEE TRANSACTIONS ON MEDICAL ROBOTICS AND BIONICS, 2024, 6 (03) : 1093 - 1103
[24] Trajectory Learning by Therapists' Demonstrations for an Upper Limb Rehabilitation Exoskeleton
Luciani, Beatrice
Roveda, Loris
Braghin, Francesco
Pedrocchi, Alessandra
Gandolla, Marta
IEEE ROBOTICS AND AUTOMATION LETTERS, 2023, 8 (08) : 4561 - 4568
[25] Batch Active Learning of Reward Functions from Human Preferences
Biyik, Erdem
Anari, Nima
Sadigh, Dorsa
ACM TRANSACTIONS ON HUMAN-ROBOT INTERACTION, 2024, 13 (02)
[26] A Novel Teacher-Assistance-Based Method to Detect and Handle Bad Training Demonstrations in Learning From Demonstration
Li, Qin
Wang, Yong
IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2022, 14 (03) : 948 - 956
[27] Preliminary experiments in motion programming of humanoid robot by human demonstrations
Konno, A
Yoshiike, T
Nagashima, K
Inaba, M
Inoue, H
JSME INTERNATIONAL JOURNAL SERIES C-MECHANICAL SYSTEMS MACHINE ELEMENTS AND MANUFACTURING, 2000, 43 (02) : 401 - 407
[28] Learning From Human Directional Corrections
Jin, Wanxin
Murphey, Todd D.
Lu, Zehui
Mou, Shaoshuai
IEEE TRANSACTIONS ON ROBOTICS, 2023, 39 (01) : 625 - 644
[29] Individual Human Behavior Identification Using an Inverse Reinforcement Learning Method
Inga, Jairo
Koepf, Florian
Flad, Michael
Hohmann, Soeren
2017 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2017, : 99 - 104
[30] Learning from Approximate Human Decisions by a Robot
Jayawardena, Chandimal
Watanabe, Keigo
Izumi, Kiyotaka
JOURNAL OF ROBOTICS AND MECHATRONICS, 2007, 19 (01) : 68 - 76
