Aggregating Long-Term Context for Learning Laparoscopic and Robot-Assisted Surgical Workflows

Cited by: 13
Authors
Ban, Yutong [1, 2]
Rosman, Guy [1, 3]
Ward, Thomas [2]
Hashimoto, Daniel [2]
Kondo, Taisei [4]
Iwaki, Hidekazu [4]
Meireles, Ozanan [2]
Rus, Daniela [1]
Affiliations
[1] Comp Sci & Artificial Intelligence Lab, 32 Vassar St, Cambridge, MA 02139 USA
[2] Massachusetts Gen Hosp, Dept Surg, SAIIL, 55 Fruit St, Boston, MA 02114 USA
[3] Toyota Res Inst, Cambridge, MA 02139 USA
[4] Shinjuku Monolith, Olympus Corp, Shinjuku Ku, Tokyo, Japan
Source
2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021) | 2021
Keywords
laparoscopic surgery; robot-assisted surgery; workflow recognition; temporal context aggregation; RECOGNITION
DOI
10.1109/ICRA48506.2021.9561770
CLC classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Analyzing surgical workflow is crucial for surgical assistance robots to understand surgeries. With an understanding of the complete surgical workflow, such robots can assist surgeons during intra-operative events, for example by giving a warning when the surgeon is entering specific key or high-risk phases. Deep learning techniques have recently been widely applied to recognizing surgical workflows. Many existing temporal neural network models are limited in their capability to handle long-term dependencies in the data and instead rely on the strong performance of the underlying per-frame visual models. We propose a new temporal network structure that leverages task-specific network representation to collect long-term sufficient statistics that are propagated by a sufficient statistics model (SSM). We implement our approach within an LSTM backbone for the task of surgical phase recognition and explore several choices for propagated statistics. We demonstrate superior results over existing and novel state-of-the-art segmentation techniques on two laparoscopic cholecystectomy datasets: the publicly available Cholec80 dataset and MGH100, a novel dataset with more challenging and clinically meaningful segment labels.
Pages: 14531-14538
Page count: 8