Toward Cognitive Digital Twin System of Human-Robot Collaboration Manipulation

Cited by: 1
Authors
Li, Xin [1, 2]
He, Bin [1, 2]
Wang, Zhipeng [1, 2]
Zhou, Yanmin [1, 2]
Li, Gang [3, 4]
Li, Xiang [1, 2]
Affiliations
[1] Tongji Univ, Shanghai Res Inst Intelligent Autonomous Syst, Coll Elect & Informat Engn, Shanghai 201804, Peoples R China
[2] Tongji Univ, Frontiers Sci Ctr Intelligent Autonomous Syst, Shanghai 201804, Peoples R China
[3] Tongji Univ, Shanghai Res Inst Intelligent Autonomous Syst, Coll Elect & Informat Engn, Shanghai 201804, Peoples R China
[4] Shanghai Sunshine Rehabil Ctr, Shanghai 201613, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Decision making; Robots; Semantics; Digital twins; Collaboration; Cognition; Solid modeling; Human-robot collaboration; multielement decision-making; large language models; digital twin; scene semantic graph; MODEL;
DOI
10.1109/TASE.2024.3452149
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Multielement decision-making is crucial for the robust deployment of human-robot collaboration (HRC) systems in flexible manufacturing environments with personalized tasks and dynamic scenes. Large Language Models (LLMs) have recently demonstrated remarkable reasoning capabilities in various robotic tasks, potentially offering this capability. However, applying LLMs to actual HRC systems requires timely and comprehensive capture of real-scene information. In this study, we propose incorporating real-scene data into LLMs using digital twin (DT) technology and present a cognitive digital twin prototype system for HRC manipulation, known as HRC-CogiDT. Specifically, we first construct a scene semantic graph encoding the geometric information of entities, spatial relations between entities, actions of humans and robots, and collaborative activities. Subsequently, we devise a prompt that merges scene semantics with prior knowledge of activities, linking the real scene with LLMs. To evaluate performance, we compile an HRC scene understanding dataset and set up a laboratory-level experimental platform. Empirical results indicate that HRC-CogiDT can swiftly perceive scene changes and make high-level decisions based on varying task requirements, such as task planning, anomaly detection, and schedule reasoning. This study provides promising insights for the future applications of LLMs in robotics.

Note to Practitioners: Recently, LLMs have demonstrated significant success in various robotic tasks, suggesting their potential as a powerful tool for robotic decision-making. Motivated by this, and to improve the production efficiency of HRC in flexible manufacturing, we combine LLMs with DT technology and propose a cognitive DT system for HRC, aiming to integrate LLMs into the decision-making loop of the HRC system. Experiments conducted on a laboratory-scale platform indicate that the proposed system can handle different decision-making needs in different HRC activities. This system can provide professional guidance to operators in a comprehensible form and serve as a medium for monitoring the safety and standardization of the manipulation process. Future work will explore the use of the virtual space provided by the proposed system to optimize the decision outputs of LLMs and make the proposed system more broadly applicable.
Pages: 6677-6690
Page count: 14