Inverse Kinematics Embedded Network for Robust Patient Anatomy Avatar Reconstruction From Multimodal Data

Cited by: 2
Authors
Zhou, Tongxi [1,2]
Chen, Mingcong [3,4]
Cao, Guanglin [1,2]
Hu, Jian [2,4]
Liu, Hongbin [2,4,5]
Affiliations
[1] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
[2] Chinese Acad Sci, Inst Automat, State Key Lab Multimodal Artificial Intelligence S, Beijing 100190, Peoples R China
[3] City Univ Hong Kong, Dept Biomed Engn, Hong Kong, Peoples R China
[4] Chinese Acad Sci, Hong Kong Inst Sci & Innovat, Ctr Artificial Intelligence & Robot, Hong Kong, Peoples R China
[5] Kings Coll London, Sch Biomed Engn & Imaging Sci, London SE1 7EU, England
Keywords
Image reconstruction; Kinematics; Three-dimensional displays; Image color analysis; Biomedical imaging; Avatars; Solid modeling; Gesture, posture and facial expressions; Deep learning for visual perception; Modeling and simulating humans; RGB-D perception
DOI
10.1109/LRA.2024.3366418
Chinese Library Classification number
TP24 [Robotics]
Subject classification code
080202; 1405
Abstract
Patient modelling has a wide range of applications in medicine and healthcare, such as clinical teaching, surgical navigation and automatic robotized scanning. Because patients are typically covered or occluded in medical scenes, directly regressing human meshes from single RGB images is challenging. To address this, we design a deep learning-based network that reconstructs patient anatomy from RGB-D images, with three key modules: 1) an attention-based multimodal fusion module, 2) an analytical inverse kinematics module and 3) an anatomical layer module. In our pipeline, the color and depth modalities are fused by the multimodal attention module to obtain a cover-insensitive feature map. The estimated 3D keypoints, learned from the fused features, are then converted to patient model parameters by the embedded analytical inverse kinematics module. To capture more detailed patient structures, we also present a parametric anatomy avatar that extends the Skinned Multi-Person Linear Model (SMPL) with internal bone and artery models. The final meshes are driven by the predicted parameters via the anatomical layer module, generating digital twins of patients. Experimental results on the Simultaneously-Collected Multimodal Lying Pose Dataset demonstrate that our approach surpasses state-of-the-art human mesh recovery methods and is robust to occlusions.
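The abstract does not spell out how the analytical inverse kinematics module is formulated. As an illustration only, the sketch below shows one common building block of analytical IK: the shortest-arc ("swing") rotation that turns a bone direction in the rest pose onto the direction implied by two estimated 3D keypoints. The function names `swing_rotation` and `rotate` are hypothetical, the twist about the bone axis is ignored, and nothing here should be read as the authors' implementation.

```python
import math


def _unit(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    if n < 1e-12:
        raise ValueError("bone vector has zero length")
    return [c / n for c in v]


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def swing_rotation(rest_bone, target_bone):
    """3x3 rotation taking the rest-pose bone direction onto the target direction.

    Hypothetical helper: closed-form shortest-arc rotation (Rodrigues form).
    Twist around the bone axis is not recovered here.
    """
    a, b = _unit(rest_bone), _unit(target_bone)
    v = _cross(a, b)   # rotation axis scaled by sin(angle)
    c = _dot(a, b)     # cos(angle)
    if c < -1.0 + 1e-9:
        # Antiparallel case: rotate 180 degrees about any axis u
        # perpendicular to a, using R = 2 u u^T - I.
        k = min(range(3), key=lambda i: abs(a[i]))
        e = [1.0 if i == k else 0.0 for i in range(3)]
        u = _unit(_cross(a, e))
        return [[2.0 * u[i] * u[j] - (1.0 if i == j else 0.0)
                 for j in range(3)] for i in range(3)]
    # Skew-symmetric matrix of v, then R = I + K + K^2 / (1 + c).
    K = [[0.0, -v[2], v[1]],
         [v[2], 0.0, -v[0]],
         [-v[1], v[0], 0.0]]
    K2 = [[sum(K[i][m] * K[m][j] for m in range(3))
           for j in range(3)] for i in range(3)]
    f = 1.0 / (1.0 + c)
    return [[(1.0 if i == j else 0.0) + K[i][j] + f * K2[i][j]
             for j in range(3)] for i in range(3)]


def rotate(R, p):
    """Apply a 3x3 rotation matrix to a 3D point or vector."""
    return [sum(R[i][j] * p[j] for j in range(3)) for i in range(3)]
```

For example, with a rest bone along the y axis and a target along the x axis, `rotate(swing_rotation([0, 1, 0], [1, 0, 0]), [0, 1, 0])` maps the rest direction onto the x axis. In a full pipeline such per-bone rotations would be composed along the kinematic tree and combined with a twist estimate before being passed to a parametric body model.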
Pages: 3395-3402
Page count: 8