LLMR: Real-time Prompting of Interactive Worlds using Large Language Models

Cited by: 13
Authors
De la Torre, Fernanda [1, 4]
Fang, Cathy Mengying [2, 4]
Huang, Han [3, 4]
Banburski-Fahey, Andrzej [4 ]
Fernandez, Judith Amores [4 ]
Lanier, Jaron [4 ]
Affiliations
[1] MIT, Cambridge, MA 02139 USA
[2] MIT, Media Lab, Cambridge, MA 02139 USA
[3] Rensselaer Polytech Inst, Troy, NY 12181 USA
[4] Microsoft, Redmond, WA 98052 USA
Source
PROCEEDINGS OF THE 2024 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS (CHI 2024) | 2024
Keywords
large language model; mixed reality; spatial reasoning; artificial intelligence
DOI
10.1145/3613904.3642579
Chinese Library Classification (CLC) number
TP18 [Theory of artificial intelligence]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present Large Language Model for Mixed Reality (LLMR), a framework for the real-time creation and modification of interactive Mixed Reality experiences using LLMs. LLMR leverages novel strategies to tackle difficult cases where ideal training data is scarce, or where the design goal requires the synthesis of internal dynamics, intuitive analysis, or advanced interactivity. Our framework relies on text interaction and the Unity game engine. By incorporating techniques for scene understanding, task planning, self-debugging, and memory management, LLMR outperforms the standard GPT-4 by 4x in average error rate. We demonstrate LLMR's cross-platform interoperability with several example worlds, and evaluate it on a variety of creation and modification tasks to show that it can produce and edit diverse objects, tools, and scenes. Finally, we conducted a usability study (N=11) with a diverse set of participants, which revealed that they had positive experiences with the system and would use it again.
Pages: 22