Improving reinforcement learning with human assistance: an argument for human subject studies with HIPPO Gym

Cited by: 1
Authors
Taylor, Matthew E. [1 ,3 ]
Nissen, Nicholas [1 ]
Wang, Yuan [1 ]
Navidi, Neda [2 ]
Affiliations
[1] Univ Alberta, Dept Comp Sci, Edmonton, AB, Canada
[2] Ecole Technol Super ETS, Montreal, PQ, Canada
[3] Univ Alberta, Alberta Machine Intelligence Inst AMII, Edmonton, AB, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
Reinforcement learning; Human-in-the-loop AI; Human-AI interaction; Human subject studies; Crowdsourcing; Open-source software;
DOI
10.1007/s00521-021-06375-y
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Reinforcement learning (RL) is a popular machine learning paradigm for game playing, robotics control, and other sequential decision tasks. However, RL agents often have long learning times with high data requirements because they begin by acting randomly. To learn better in complex tasks, we argue that an external teacher can often significantly help the RL agent learn. OpenAI Gym is a common framework for RL research that includes a large number of standard environments and agents, making RL research significantly more accessible. This article introduces our new open-source RL framework, the Human Input Parsing Platform for OpenAI Gym (HIPPO Gym), and the design decisions that went into its creation. The goal of this platform is to facilitate human-RL research and make human-in-the-loop RL more accessible, including learning from demonstrations, learning from feedback, and curriculum learning. In addition, all experiments can be conducted over the internet without any additional software on the client's computer, making experiments at scale significantly easier.
Pages: 23429-23439
Number of pages: 11