Data-centric multi-task surgical phase estimation with sparse scene segmentation

Cited by: 8
Authors
Sanchez-Matilla, Ricardo [1]
Robu, Maria [1]
Grammatikopoulou, Maria [1]
Luengo, Imanol [1]
Stoyanov, Danail [1, 2]
Affiliations
[1] Digital Surg, London, England
[2] UCL, Wellcome EPSRC Ctr Intervent & Surg Sci, London, England
Keywords
Surgical phases; Scene segmentation; Surgical data science; Multi-task
DOI
10.1007/s11548-022-02616-0
Chinese Library Classification (CLC) number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Purpose: Surgical workflow estimation techniques aim to divide a surgical video into temporal segments based on predefined surgical actions or objectives, which can be of different granularity, such as steps or phases. Potential applications range from real-time intra-operative feedback to automatic post-operative reports and analysis. A common approach in the literature to automatic surgical phase estimation is to decouple the problem into two stages: feature extraction from a single frame and temporal feature fusion. The problem is split in this way because of computational restrictions when processing large spatio-temporal sequences.
Methods: The majority of existing works focus on improving performance solely through temporal model development. In contrast, we follow a data-centric approach and propose a training pipeline that enables models to maximise the usage of existing datasets, which are generally used in isolation. Specifically, we use the dense phase annotations available in Cholec80 and the sparse scene (i.e., instrument and anatomy) segmentation annotations available in CholecSeg8k for less than 5% of the overlapping frames. We propose a simple multi-task encoder that effectively fuses both streams, when available, based on their importance and jointly optimises them to perform accurate phase prediction.
Results and conclusion: We show that, with a small fraction of scene segmentation annotations, a relatively simple model can obtain results comparable to those of previous state-of-the-art and more complex architectures when evaluated in similar settings. We hope that this data-centric approach can encourage new research directions where data, and how to use it, plays an important role along with model development.
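The abstract does not say how the two annotation streams are combined during training. As a rough illustration only, and not the authors' implementation, the idea of jointly optimising a dense phase objective with a sparse segmentation objective can be sketched as a masked multi-task loss: every frame contributes to the phase term, while only frames that carry a segmentation mask contribute to the segmentation term. The function names, the fixed weight `seg_weight`, and the use of `None` to mark unlabelled frames are all assumptions made for this sketch; the paper describes fusion "based on their importance", which a fixed weight only approximates.

```python
import math


def cross_entropy(probs, target):
    """Negative log-likelihood of the target class under a probability vector."""
    return -math.log(probs[target])


def mean_cross_entropy(prob_rows, targets):
    """Mean cross-entropy over paired probability vectors and class indices."""
    return sum(cross_entropy(p, t) for p, t in zip(prob_rows, targets)) / len(targets)


def multitask_loss(phase_probs, phase_labels, seg_probs, seg_labels, seg_weight=0.5):
    """Hypothetical joint loss: dense phase term plus a masked segmentation term.

    seg_labels[i] is a flat list of per-pixel class indices, or None when
    frame i has no segmentation annotation (the sparse case).
    seg_weight is an assumed fixed trade-off, not a value from the paper.
    """
    # Phase labels are dense, so every frame contributes.
    phase_loss = mean_cross_entropy(phase_probs, phase_labels)

    # Segmentation labels are sparse, so unlabelled frames are skipped.
    per_frame = [
        mean_cross_entropy(pixel_probs, pixel_labels)
        for pixel_probs, pixel_labels in zip(seg_probs, seg_labels)
        if pixel_labels is not None
    ]
    if not per_frame:
        return phase_loss
    seg_loss = sum(per_frame) / len(per_frame)
    return phase_loss + seg_weight * seg_loss


# Toy batch: two frames, two phases; only frame 0 has a (two-pixel) mask.
phase_probs = [[0.5, 0.5], [0.5, 0.5]]
phase_labels = [0, 1]
seg_probs = [[[0.25, 0.75], [0.5, 0.5]], None]
seg_labels = [[1, 0], None]

loss = multitask_loss(phase_probs, phase_labels, seg_probs, seg_labels)
```

With no segmentation labels in the batch the function falls back to the phase loss alone, which is what lets a dataset with under 5% of frames annotated still be used in every training step.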
Pages: 953-960
Page count: 8