Learning from Richer Human Guidance: Augmenting Comparison-Based Learning with Feature Queries

Cited by: 35

Authors:

Basu, Chandrayee [1]

Singhal, Mukesh [1]

Dragan, Anca D. [2]

Affiliations:

[1] UC Merced, Merced, CA 95343 USA

[2] Univ Calif Berkeley, Berkeley, CA USA

Source:

HRI '18: PROCEEDINGS OF THE 2018 ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION | 2018

Funding:

National Science Foundation (USA);

Keywords:

reward learning; comparison-based learning; learning from human guidance; driving style;

DOI:

10.1145/3171221.3171284

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

We focus on learning the desired objective function for a robot. Although trajectory demonstrations can be very informative of the desired objective, they can also be difficult for users to provide. Answers to comparison queries, asking which of two trajectories is preferable, are much easier for users, and have emerged as an effective alternative. Unfortunately, comparisons are far less informative. We propose that there is much richer information that users can easily provide and that robots ought to leverage. We focus on augmenting comparisons with feature queries, and introduce a unified formalism for treating all answers as observations about the true desired reward. We derive an active query selection algorithm, and test these queries in simulation and on real users. We find that richer, feature-augmented queries can extract more information faster, leading to robots that better match user preferences in their behavior.
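The idea of treating a comparison answer as an observation about the true reward can be illustrated with a minimal sketch. It assumes a reward that is linear in trajectory features, a logistic (Bradley-Terry style) answer model, and a particle-based belief over reward weights; none of these choices is confirmed by the abstract, and the names `preference_likelihood`, `update_belief`, and the feature vectors are hypothetical. Feature queries are not sketched because the abstract does not specify their form.

```python
import math


def preference_likelihood(w, phi_a, phi_b):
    """P(user prefers A over B | w) under an ASSUMED logistic answer model."""
    diff = sum(wi * (a - b) for wi, a, b in zip(w, phi_a, phi_b))
    return 1.0 / (1.0 + math.exp(-diff))


def update_belief(particles, weights, phi_a, phi_b, prefers_a):
    """Bayes update of a particle belief over reward weights from one answer."""
    unnormalized = []
    for w, p in zip(particles, weights):
        like = preference_likelihood(w, phi_a, phi_b)
        unnormalized.append(p * (like if prefers_a else 1.0 - like))
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


if __name__ == "__main__":
    # Three hypothetical candidate weight vectors with a uniform prior.
    particles = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    prior = [1.0 / 3.0] * 3
    # Hypothetical features: trajectory A scores on feature 0, B on feature 1.
    posterior = update_belief(particles, prior, [1.0, 0.0], [0.0, 1.0], prefers_a=True)
    print(posterior)
```

After the user picks trajectory A, belief mass shifts toward the candidate that weights feature 0 most heavily. A single comparison moves the belief only modestly, which is consistent with the abstract's remark that comparisons are far less informative than demonstrations.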


Pages: 132-140

Page count: 9

