Toward Cognitive Digital Twin System of Human-Robot Collaboration Manipulation

Cited by: 1
Authors
Li, Xin [1, 2]
He, Bin [1, 2]
Wang, Zhipeng [1, 2]
Zhou, Yanmin [1, 2]
Li, Gang [3, 4]
Li, Xiang [1, 2]
Affiliations
[1] Tongji Univ, Shanghai Res Inst Intelligent Autonomous Syst, Coll Elect & Informat Engn, Shanghai 201804, Peoples R China
[2] Tongji Univ, Frontiers Sci Ctr Intelligent Autonomous Syst, Shanghai 201804, Peoples R China
[3] Tongji Univ, Shanghai Res Inst Intelligent Autonomous Syst, Coll Elect & Informat Engn, Shanghai 201804, Peoples R China
[4] Shanghai Sunshine Rehabil Ctr, Shanghai 201613, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Decision making; Robots; Semantics; Digital twins; Collaboration; Cognition; Solid modeling; Human-robot collaboration; multielement decision-making; large language models; digital twin; scene semantic graph; MODEL;
DOI
10.1109/TASE.2024.3452149
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Multielement decision-making is crucial for the robust deployment of human-robot collaboration (HRC) systems in flexible manufacturing environments with personalized tasks and dynamic scenes. Large Language Models (LLMs) have recently demonstrated remarkable reasoning capabilities in various robotic tasks, potentially offering this capability. However, the application of LLMs to actual HRC systems requires the timely and comprehensive capturing of real-scene information. In this study, we suggest incorporating real scene data into LLMs using digital twin (DT) technology and present a cognitive digital twin prototype system of HRC manipulation, known as HRC-CogiDT. Specifically, we initially construct a scene semantic graph encoding the geometric information of entities, spatial relations between entities, actions of humans and robots, and collaborative activities. Subsequently, we devise a prompt that merges scene semantics with prior knowledge of activities, linking the real scene with LLMs. To evaluate performance, we compile an HRC scene understanding dataset and set up a laboratory-level experimental platform. Empirical results indicate that HRC-CogiDT can swiftly perceive scene changes and make high-level decisions based on varying task requirements, such as task planning, anomaly detection, and schedule reasoning. This study provides promising insights for the future applications of LLMs in robotics.
Note to Practitioners: Recently, LLMs have demonstrated significant success in various robotic tasks, suggesting their potential as a powerful tool for robotic decision-making. Motivated by this, to improve the production efficiency of HRC in flexible manufacturing, we innovatively combine LLMs with DT technology, and propose a cognitive DT system for HRC, aiming to integrate LLMs into the decision-making loop of HRC system. Experiments conducted in a laboratory-scale platform indicate that the proposed system can handle different decision-making needs in different HRC activities. This system can provide professional guidance to operators in a comprehensible form and serve as a medium for monitoring the safety and standardization of the manipulation process. Future work will explore the use of virtual space provided by the proposed system to optimize the decision outputs of LLMs to make the proposed system more broadly applicable.
Pages: 6677-6690
Page count: 14