Bootstrapping Adaptive Human-Machine Interfaces with Offline Reinforcement Learning

Cited by: 0
Authors
Gao, Jensen [1 ,2 ]
Reddy, Siddharth [2 ]
Berseth, Glen [2 ,3 ,4 ]
Dragan, Anca D. [2 ]
Levine, Sergey [2 ]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
[2] Univ Calif Berkeley, Berkeley, CA 94720 USA
[3] Univ Montreal, Montreal, PQ, Canada
[4] MILA, Montreal, PQ, Canada
DOI
10.1109/IROS55552.2023.10341779
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Adaptive interfaces can help users perform sequential decision-making tasks like robotic teleoperation given noisy, high-dimensional command signals (e.g., from a brain-computer interface). Recent advances in human-in-the-loop machine learning enable such systems to improve by interacting with users, but they tend to be limited by the amount of data that they can collect from individual users in practice. In this paper, we propose a reinforcement learning algorithm that addresses this by training an interface to map raw command signals to actions using a combination of offline pre-training and online fine-tuning. To address the challenges posed by noisy command signals and sparse rewards, we develop a novel method for representing and inferring the user's long-term intent for a given trajectory. We primarily evaluate our method's ability to assist users who can only communicate through noisy, high-dimensional input channels via a user study in which 12 participants performed a simulated navigation task by using their eye gaze to modulate a 128-dimensional command signal from their webcam. The results show that our method enables successful goal navigation more often than a baseline directional interface, by learning to denoise user command signals and provide shared autonomy assistance. We further evaluate on a simulated Sawyer pushing task with eye gaze control, and the Lunar Lander game with simulated user commands, and find that our method improves over baseline interfaces in these domains as well. Extensive ablation experiments with simulated user commands empirically motivate each component of our method.
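The abstract's core recipe — learn a mapping from noisy, high-dimensional command signals to actions, first from logged offline data and then from sparse online rewards — can be illustrated with a toy sketch. This is a generic offline-pretrain/online-finetune scaffold, not the paper's actual algorithm: the `CommandInterface` class, the linear-softmax policy, and the synthetic data generator below are all illustrative assumptions.

```python
import numpy as np

class CommandInterface:
    """Toy linear-softmax interface: maps a noisy, high-dimensional
    command signal to one of a few discrete actions."""

    def __init__(self, signal_dim, n_actions, lr=0.05, seed=0):
        rng = np.random.default_rng(seed)
        self.W = 0.01 * rng.standard_normal((signal_dim, n_actions))
        self.lr = lr

    def action_probs(self, signal):
        logits = signal @ self.W
        logits = logits - logits.max()  # numerical stability
        e = np.exp(logits)
        return e / e.sum()

    def act(self, signal):
        return int(np.argmax(self.action_probs(signal)))

    def pretrain_offline(self, signals, actions, epochs=30):
        """Offline phase: supervised (behavior-cloning-style) training
        on a logged dataset of (signal, intended action) pairs."""
        n_actions = self.W.shape[1]
        for _ in range(epochs):
            for s, a in zip(signals, actions):
                p = self.action_probs(s)
                onehot = np.eye(n_actions)[a]
                # gradient ascent on log-likelihood (cross-entropy)
                self.W += self.lr * np.outer(s, onehot - p)

    def finetune_online(self, signal, action, reward):
        """Online phase: one REINFORCE-style update per interaction,
        using only a sparse scalar reward (e.g., task success)."""
        p = self.action_probs(signal)
        onehot = np.eye(self.W.shape[1])[action]
        self.W += self.lr * reward * np.outer(signal, onehot - p)

def make_logged_data(n, signal_dim, n_actions, noise=0.5, seed=1):
    """Synthetic logs: the user's intent is hidden in the first
    n_actions dimensions of the signal; the rest is noise."""
    rng = np.random.default_rng(seed)
    intents = rng.integers(0, n_actions, size=n)
    signals = noise * rng.standard_normal((n, signal_dim))
    signals[np.arange(n), intents] += 1.0
    return signals, intents
```

In this sketch, offline pre-training gives the interface a usable signal-to-action mapping before any live interaction, so the (expensive) online phase only has to fine-tune it — mirroring the data-efficiency motivation in the abstract.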
Pages
7523 - 7530 (8 pages)