A Survey of Multimodal Perception Methods for Human-Robot Interaction in Social Environments

Cited by: 1
Authors
Duncan, John A. [1 ]
Alambeigi, Farshid [1 ]
Pryor, Mitchell W. [1 ]
Affiliations
[1] The University of Texas at Austin, Austin, TX 78712, USA
Keywords
Human-robot interaction; multimodal perception; situated interaction; social robotics; human social environments
Keywords Plus
user engagement; sound sources; localization; recognition; design; fusion; system; framework; network; dataset
DOI
10.1145/3657030
Chinese Library Classification
TP24 [Robotics]
Discipline Classification Codes
080202; 1405
Abstract
Human-robot interaction (HRI) in human social environments (HSEs) poses unique challenges for robot perception systems, which must combine asynchronous, heterogeneous data streams in real time. Multimodal perception systems are well suited to HRI in HSEs and can enable richer, more robust interaction for robots operating among humans. In this article, we provide an overview of multimodal perception systems used in HSEs, intended as an introduction to the topic and a summary of relevant trends, techniques, resources, challenges, and terminology. We surveyed 15 peer-reviewed robotics and HRI publications from the past 10+ years, detailing the data acquisition, processing, and fusion techniques used in 65 multimodal perception systems across various HRI domains. The survey covers the hardware, software, datasets, and methods currently available for HRI perception research, as well as how these perception systems are applied in HSEs. Based on the survey, we summarize trends, challenges, and limitations of multimodal human perception systems for robots, identify resources for researchers and developers, and propose future research areas to advance the field.
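The core technical challenge the abstract names, combining asynchronous, heterogeneous data streams in real time, is commonly handled with soft time synchronization: each modality is buffered, and measurements are fused only when their timestamps agree within a tolerance. The Python sketch below illustrates that pattern for a hypothetical audio/vision pair; it is not code from the article, and all names (Stamped, NearestTimeFuser, the 50 ms tolerance) are illustrative assumptions.

```python
# A minimal sketch (not from the surveyed article) of soft time
# synchronization for two asynchronous sensor streams: buffer each
# stream, then pair measurements whose timestamps fall within `tol`.
from collections import deque
from dataclasses import dataclass

@dataclass
class Stamped:
    t: float        # timestamp in seconds
    value: object   # modality-specific payload (e.g., DOA angle, face bbox)

class NearestTimeFuser:
    """Pairs the newest measurements from two streams whose timestamps
    differ by at most `tol` seconds (a soft-synchronization policy)."""
    def __init__(self, tol: float = 0.05, maxlen: int = 64):
        self.tol = tol
        self.audio = deque(maxlen=maxlen)   # ring buffers bound memory
        self.vision = deque(maxlen=maxlen)  # for streams of unequal rate

    def push_audio(self, m: Stamped):
        self.audio.append(m)
        return self._try_fuse()

    def push_vision(self, m: Stamped):
        self.vision.append(m)
        return self._try_fuse()

    def _try_fuse(self):
        if not self.audio or not self.vision:
            return None
        a, v = self.audio[-1], self.vision[-1]
        # Fuse only when the newest measurements are close enough in
        # time; otherwise keep buffering until the slower stream catches up.
        if abs(a.t - v.t) <= self.tol:
            return (a, v)
        return None

if __name__ == "__main__":
    fuser = NearestTimeFuser(tol=0.05)
    fuser.push_audio(Stamped(t=0.000, value={"doa_deg": 30.0}))
    pair = fuser.push_vision(Stamped(t=0.030, value={"face_bbox": (40, 60, 120, 160)}))
    if pair:
        print("fused within tolerance:", pair[0].value, pair[1].value)
```

Production systems apply the same idea with more sophisticated queueing; for example, ROS's message_filters package provides an ApproximateTime synchronization policy for exactly this buffering-and-pairing problem.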
Pages: 50