Machine Teaching for Human Inverse Reinforcement Learning

Cited by: 9
Authors
Lee, Michael S. [1]
Admoni, Henny [1]
Simmons, Reid [1]
Affiliations
[1] Carnegie Mellon Univ, Robot Inst, Pittsburgh, PA 15213 USA
Keywords
inverse reinforcement learning; learning from demonstration; scaffolding; policy summarization; machine teaching; RELIABILITY; PREFERENCES
DOI
10.3389/frobt.2021.693050
Chinese Library Classification (CLC) number
TP24 [Robotics]
Discipline classification codes
080202; 1405
Abstract
As robots continue to acquire useful skills, their ability to teach their expertise will provide humans the two-fold benefit of learning from robots and collaborating fluently with them. For example, robot tutors could teach handwriting to individual students and delivery robots could convey their navigation conventions to better coordinate with nearby human workers. Because humans naturally communicate their behaviors through selective demonstrations, and comprehend others' through reasoning that resembles inverse reinforcement learning (IRL), we propose a method of teaching humans based on demonstrations that are informative for IRL. But unlike prior work that optimizes solely for IRL, this paper incorporates various human teaching strategies (e.g. scaffolding, simplicity, pattern discovery, and testing) to better accommodate human learners. We assess our method with user studies and find that our measure of test difficulty corresponds well with human performance and confidence, and also find that favoring simplicity and pattern discovery increases human performance on difficult tests. However, we did not find a strong effect for our method of scaffolding, revealing shortcomings that indicate clear directions for future work.
Pages: 14