GPT-Driven Gestures: Leveraging Large Language Models to Generate Expressive Robot Motion for Enhanced Human-Robot Interaction

Cited by: 0
Authors
Roy, Liam [1 ]
Croft, Elizabeth A. [2 ]
Ramirez, Alex [3 ]
Kulic, Dana [1 ]
Affiliations
[1] Monash Univ, Clayton, Vic 3800, Australia
[2] Univ Victoria, Victoria, BC V8P 5C2, Canada
[3] Univ Calgary, Calgary, AB T2N 1N4, Canada
Source
IEEE ROBOTICS AND AUTOMATION LETTERS | 2025, Vol. 10, Issue 5
Keywords
Robots; Human-robot interaction; Robot motion; Crowdsourcing; Accuracy; Robot kinematics; Vectors; Manuals; Collaboration; Quadrupedal robots; Human-robot collaboration; multi-modal perception for HRI; gesture, posture and facial expressions; social HRI; natural machine motion
DOI
10.1109/LRA.2025.3547631
CLC Number
TP24 [Robotics]
Discipline Code
080202; 1405
Abstract
Expressive robot motion is a form of nonverbal communication that enables robots to convey their internal states, fostering effective human-robot interaction. A key step in designing expressive robot motion is developing a mapping from the states the robot should express to the robot's hardware and available degrees of freedom (its design space). This letter introduces a novel framework that generates this mapping autonomously by leveraging a large language model (LLM) to select motion parameters, and values for those parameters, for each target robot state. We evaluate expressive robot body language displayed on a Unitree Go1 quadruped and generated by a Generative Pre-trained Transformer (GPT) supplied with a set of adjustable motion parameters. In a two-part study (N = 120), we compared LLM-generated expressive motions with both randomly selected and human-selected expressions. Our results show that participants viewing LLM-generated expressions achieve significantly higher state classification accuracy than those viewing random baselines, and accuracy comparable to that achieved with human-generated expressions. In a post-hoc analysis, we further find that the Earth Mover's Distance is a useful metric for identifying expressions that lie close together in the design space and therefore cause classification confusion.
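The letter itself ships no code; as a rough illustration of the framework's core step, the sketch below asks a GPT chat model to assign values to a fixed set of adjustable motion parameters for a target state. The parameter names, ranges, prompt wording, and model name are assumptions made for illustration, not the authors' actual Unitree Go1 design space or pipeline; the OpenAI chat-completions client stands in for whichever LLM interface the paper used.

```python
# Hedged sketch: ask an LLM to map a target robot state to values for a
# fixed set of adjustable motion parameters. Parameter names and ranges
# are hypothetical, not the paper's actual Go1 design space.
import json
from openai import OpenAI  # assumes the OpenAI Python client is installed

# Hypothetical design space: each parameter with its allowed [min, max] range.
DESIGN_SPACE = {
    "body_height":    (-0.10, 0.10),  # m, offset from nominal stance
    "body_pitch":     (-0.30, 0.30),  # rad, nose-down negative
    "gait_frequency": (0.5, 3.0),     # Hz, stepping rate
    "sway_amplitude": (0.0, 0.05),    # m, lateral body oscillation
}

def generate_expression(target_state: str) -> dict:
    """Ask the LLM for one parameter assignment expressing `target_state`."""
    client = OpenAI()
    prompt = (
        f"A quadruped robot must express the internal state '{target_state}' "
        "through body language alone. Choose a value for each motion "
        "parameter below, staying inside its [min, max] range, and reply "
        "with a JSON object only.\n"
        + "\n".join(f"- {name}: range {rng}" for name, rng in DESIGN_SPACE.items())
    )
    resp = client.chat.completions.create(
        model="gpt-4",  # placeholder; the paper's exact GPT model may differ
        messages=[{"role": "user", "content": prompt}],
    )
    params = json.loads(resp.choices[0].message.content)
    # Clamp to the declared ranges in case the model drifts out of bounds.
    return {
        name: min(max(float(params[name]), lo), hi)
        for name, (lo, hi) in DESIGN_SPACE.items()
    }

print(generate_expression("excited"))
```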
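Likewise, under one plausible reading of the post-hoc analysis, the Earth Mover's Distance could compare the distributions of parameter vectors generated for different target states, with small distances flagging confusable expressions. The sketch below sums per-parameter 1-D Wasserstein distances via scipy.stats.wasserstein_distance; the state names and toy data are invented, and the paper's exact EMD formulation may differ.

```python
# Hedged sketch: use Earth Mover's Distance to flag pairs of expressions that
# sit close together in the motion-parameter design space (one plausible
# reading of the paper's post-hoc analysis, not its exact formulation).
import numpy as np
from scipy.stats import wasserstein_distance

def expression_emd(samples_a: np.ndarray, samples_b: np.ndarray) -> float:
    """Sum of per-parameter 1-D EMDs between two sets of parameter vectors.

    samples_a, samples_b: arrays of shape (n_samples, n_params), e.g. the
    parameter assignments generated for two different target states.
    """
    return sum(
        wasserstein_distance(samples_a[:, j], samples_b[:, j])
        for j in range(samples_a.shape[1])
    )

# Toy usage: states whose expressions overlap in the design space should
# yield a smaller EMD (more confusable) than clearly distinct states.
rng = np.random.default_rng(0)
excited = rng.normal([0.05, 2.5], 0.01, size=(20, 2))
happy   = rng.normal([0.04, 2.3], 0.01, size=(20, 2))
tired   = rng.normal([-0.08, 0.7], 0.01, size=(20, 2))
print(expression_emd(excited, happy))  # small -> likely classification confusion
print(expression_emd(excited, tired))  # large -> distinct expressions
```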
Pages: 4172-4179
Page count: 8