E-DSSR: Efficient Dynamic Surgical Scene Reconstruction with Transformer-Based Stereoscopic Depth Perception

Cited by: 40
Authors
Long, Yonghao [1]
Li, Zhaoshuo [2]
Yee, Chi Hang [3]
Ng, Chi Fai [3]
Taylor, Russell H. [2]
Unberath, Mathias [2]
Dou, Qi [1, 4]
Affiliations
[1] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Shatin, Hong Kong, Peoples R China
[2] Johns Hopkins Univ, Dept Comp Sci, Baltimore, MD 21218 USA
[3] Chinese Univ Hong Kong, SH Ho Urol Ctr, Dept Surg, Shatin, Hong Kong, Peoples R China
[4] Chinese Univ Hong Kong, T Stone Robot Inst, Shatin, Hong Kong, Peoples R China
Source
MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2021, PT IV | 2021 / Vol. 12904
Keywords
Dynamic surgical scene reconstruction; Transformer-based depth estimation; Stereo image perception;
DOI
10.1007/978-3-030-87202-1_40
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Reconstructing the scene of robotic surgery from stereo endoscopic video is an important and promising topic in surgical data science, which potentially supports many applications such as surgical visual perception, robotic surgery education, and intra-operative context awareness. However, current methods are mostly restricted to reconstructing static anatomy, assuming no tissue deformation, no tool occlusion or de-occlusion, and no camera movement. These assumptions are not always satisfied in minimally invasive robotic surgeries. In this work, we present an efficient reconstruction pipeline for highly dynamic surgical scenes that runs at 28 fps. Specifically, we design a transformer-based stereoscopic depth perception module for efficient depth estimation and a lightweight tool segmentor to handle tool occlusion. A dynamic reconstruction algorithm that estimates tissue deformation and camera movement and aggregates information over time is then proposed for surgical scene reconstruction. We evaluate the proposed pipeline on two datasets, the public Hamlyn Centre Endoscopic Video Dataset and our in-house DaVinci robotic surgery dataset. The results demonstrate that our method can recover the scene obstructed by the surgical tool and handle camera movement in realistic surgical scenarios effectively at real-time speed.
Pages: 415-425
Page count: 11