SV-RCNet: Workflow Recognition From Surgical Videos Using Recurrent Convolutional Network

Cited by: 213

Authors:

Jin, Yueming [1]

Dou, Qi [1]

Chen, Hao [1]

Yu, Lequan [1]

Qin, Jing [2]

Fu, Chi-Wing [1]

Heng, Pheng-Ann [1]

Affiliations:

[1] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong, Hong Kong, Peoples R China

[2] Hong Kong Polytech Univ, Sch Nursing, Ctr Smart Hlth, Hong Kong, Hong Kong, Peoples R China

Source:

IEEE TRANSACTIONS ON MEDICAL IMAGING | 2018 / Vol. 37 / No. 05

Keywords:

Recurrent convolutional network; surgical workflow recognition; joint learning of spatio-temporal features; very deep residual network; long short-term memory; SEGMENTATION;

DOI:

10.1109/TMI.2017.2787657

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

We propose an analysis of surgical videos that is based on a novel recurrent convolutional network (SV-RCNet), specifically for automatic workflow recognition from surgical videos online, which is a key component for developing context-aware computer-assisted intervention systems. Unlike previous methods, which harness visual and temporal information separately, the proposed SV-RCNet seamlessly integrates a convolutional neural network (CNN) and a recurrent neural network (RNN) to form a novel recurrent convolutional architecture in order to take full advantage of the complementary information of visual and temporal features learned from surgical videos. We effectively train the SV-RCNet in an end-to-end manner so that the visual representations and sequential dynamics can be jointly optimized in the learning process. In order to produce more discriminative spatio-temporal features, we exploit a deep residual network (ResNet) and a long short-term memory (LSTM) network to extract visual features and temporal dependencies, respectively, and integrate them into the SV-RCNet. Moreover, based on the phase transition-sensitive predictions from the SV-RCNet, we propose a simple yet effective inference scheme, namely the prior knowledge inference (PKI), by leveraging the natural characteristics of surgical video. Such a strategy further improves the consistency of results and largely boosts the recognition performance. Extensive experiments have been conducted with the MICCAI 2016 Modeling and Monitoring of Computer Assisted Interventions Workflow Challenge dataset and the Cholec80 dataset to validate SV-RCNet. Our approach not only achieves superior performance on these two datasets but also outperforms the state-of-the-art methods by a significant margin.
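The abstract does not spell out how PKI works, so the following is only an illustrative sketch of the general idea of using phase order as prior knowledge to make per-frame predictions more consistent. It is not the authors' PKI scheme. The function name `smooth_phases`, the `patience` parameter, and the assumption that phases occur strictly in the order 0, 1, 2, ... are all hypothetical choices made for this example.

```python
from typing import List


def smooth_phases(preds: List[int], num_phases: int, patience: int = 3) -> List[int]:
    """Online, order-constrained smoothing of per-frame phase predictions.

    Assumptions (hypothetical, not taken from the source): phases are numbered
    0..num_phases-1 and occur in that order, so from phase p the output can only
    stay at p or advance to p+1. Advancing requires `patience` consecutive
    frames that predict p+1.
    """
    current = 0   # assume every video starts in phase 0
    streak = 0    # consecutive frames voting for the next phase
    out: List[int] = []
    for p in preds:
        if p == current + 1 and current + 1 < num_phases:
            streak += 1
            if streak >= patience:
                current += 1
                streak = 0
        else:
            # Same-phase or out-of-order votes reset the streak.
            streak = 0
        out.append(current)
    return out


if __name__ == "__main__":
    raw = [0, 0, 3, 0, 1, 1, 1, 1, 2, 0, 2, 2, 2]
    print(smooth_phases(raw, num_phases=4, patience=3))
```

In this sketch the isolated jump to phase 3 and the brief drop back to phase 0 are both suppressed, at the cost of a delay of `patience` frames at each real transition. Because each output depends only on past frames, the smoothing is compatible with the online setting the abstract describes.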

Citation

Pages: 1114-1126

Page count: 13

48 entries in total

[1]

[Anonymous], 2017, A survey on deep learning in medical image analysis

[2]

[Anonymous], 2016, PROC CVPR IEEE, DOI 10.1109/CVPR.2016.319

[3]

[Anonymous], 1987, Computational Limitations of Small-depth Circuits

[4]

Bardram JE, 2011, INT CONF PERVAS COMP, P45, DOI 10.1109/PERCOM.2011.5767594

[5]

Bhatia B., 2007, AAAI, V2, P1761

[6]

Bin Kong, 2016, Medical Image Computing and Computer-Assisted Intervention - MICCAI 2016. 19th International Conference. Proceedings: LNCS 9902, P264, DOI 10.1007/978-3-319-46726-9_31

[7]

Blum T, 2010, LECT NOTES COMPUT SC, V6363, P400

[8]

Cadene R., 2016, M2CAI workflow challenge: Convolutional neural networks with time smoothing and hidden markov model for video frames classification

[9] Ultrasound Standard Plane Detection Using a Composite Neural Network Framework [J].

Chen, Hao; Wu, Lingyun; Dou, Qi; Qin, Jing; Li, Shengli; Cheng, Jie-Zhi; Ni, Dong; Heng, Pheng-Ann.

IEEE TRANSACTIONS ON CYBERNETICS, 2017, 47 (06): 1576-1586

[10] Automatic Fetal Ultrasound Standard Plane Detection Using Knowledge Transferred Recurrent Neural Networks [J].