Exploring Segment-Level Semantics for Online Phase Recognition From Surgical Videos

Cited by: 22
Authors
Ding, Xinpeng [1]
Li, Xiaomeng [1, 2]
Affiliations
[1] Hong Kong Univ Sci & Technol, Dept Elect & Comp Engn, Hong Kong, Peoples R China
[2] Hong Kong Univ Sci & Technol, Shenzhen Res Inst, Shenzhen 518057, Peoples R China
Keywords
Surgery; Videos; Feature extraction; Semantics; Hidden Markov models; Task analysis; Convolution; Surgical video analysis; surgical phase recognition; REAL-TIME SEGMENTATION; WORKFLOW RECOGNITION; TASKS;
DOI
10.1109/TMI.2022.3182995
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Automatic surgical phase recognition plays a vital role in robot-assisted surgeries. Existing methods overlook a pivotal point: surgical phases should be classified by learning segment-level semantics rather than relying solely on frame-wise information. This paper presents a segment-attentive hierarchical consistency network (SAHC) for surgical phase recognition from videos. The key idea is to extract hierarchical high-level semantic-consistent segments and use them to refine the erroneous predictions caused by ambiguous frames. To achieve this, we design a temporal hierarchical network to generate hierarchical high-level segments. Then, we introduce a hierarchical segment-frame attention module to capture relations between the low-level frames and high-level segments. By regularizing the predictions of frames and their corresponding segments via a consistency loss, the network can generate semantic-consistent segments and then rectify the misclassified predictions caused by ambiguous low-level frames. We validate SAHC on two public surgical video datasets, i.e., the M2CAI16 challenge dataset and the Cholec80 dataset. Experimental results show that our method outperforms previous state-of-the-art methods, and ablation studies demonstrate the effectiveness of our proposed modules. Our code has been released at: https://github.com/xmed-lab/SAHC.
Pages: 3309-3319
Number of pages: 11