Image Compositing for Segmentation of Surgical Tools Without Manual Annotations

Cited by: 29
Authors
Garcia-Peraza-Herrera, Luis C. [1, 2]
Fidon, Lucas [2]
D'Ettorre, Claudia [3]
Stoyanov, Danail [3]
Vercauteren, Tom [2]
Ourselin, Sebastien [2]
Affiliations
[1] UCL, Dept Med Phys & Biomed Engn, London WC1E 6BT, England
[2] Kings Coll London, Dept Surg & Intervent Engn, London WC2R 2LS, England
[3] UCL, Dept Comp Sci, London WC1E 6BT, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC); European Union Horizon 2020;
Keywords
Image segmentation; Instruments; Tools; Training; Task analysis; Surgery; Manuals; Image compositing; chroma key; tool segmentation;
DOI
10.1109/TMI.2021.3057884
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Producing manual, pixel-accurate, image segmentation labels is tedious and time-consuming. This is often a rate-limiting factor when large amounts of labeled images are required, such as for training deep convolutional networks for instrument-background segmentation in surgical scenes. No large datasets comparable to industry standards in the computer vision community are available for this task. To circumvent this problem, we propose to automate the creation of a realistic training dataset by exploiting techniques stemming from special effects and harnessing them to target training performance rather than visual appeal. Foreground data is captured by placing sample surgical instruments over a chroma key (a.k.a. green screen) in a controlled environment, thereby making extraction of the relevant image segment straightforward. Multiple lighting conditions and viewpoints can be captured and introduced in the simulation by moving the instruments and camera and modulating the light source. Background data is captured by collecting videos that do not contain instruments. In the absence of pre-existing instrument-free background videos, minimal labeling effort is required, just to select frames that do not contain surgical instruments from videos of surgical interventions freely available online. We compare different methods to blend instruments over tissue and propose a novel data augmentation approach that takes advantage of the plurality of options. We show that by training a vanilla U-Net on semi-synthetic data only and applying a simple post-processing, we are able to match the results of the same network trained on a publicly available manually labeled real dataset.
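The foreground extraction and blending steps described in the abstract can be illustrated with a minimal sketch. It assumes a simple "green dominates" threshold for the chroma key and plain alpha blending; the function names `chroma_key_alpha` and `composite` and the `margin` parameter are hypothetical and are not taken from the paper, which compares several blending methods not reproduced here.

```python
import numpy as np


def chroma_key_alpha(foreground, margin=40):
    """Return a matte: 0.0 where a pixel looks like green screen, 1.0 elsewhere.

    Assumption: a pixel is screen if green exceeds both red and blue by
    more than `margin`. Real chroma keying is usually more elaborate.
    """
    rgb = foreground.astype(np.int32)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    is_screen = (g - np.maximum(r, b)) > margin
    return (~is_screen).astype(np.float32)


def composite(foreground, background, alpha):
    """Alpha-blend the instrument (foreground) over tissue (background)."""
    a = alpha[..., None]  # broadcast the matte over the colour channels
    blended = a * foreground.astype(np.float32) \
        + (1.0 - a) * background.astype(np.float32)
    return np.clip(np.rint(blended), 0, 255).astype(np.uint8)


# Toy 2x2 example: one grey "instrument" pixel on a green screen.
fg = np.zeros((2, 2, 3), dtype=np.uint8)
fg[...] = (0, 255, 0)        # green screen everywhere
fg[0, 0] = (200, 200, 200)   # instrument pixel
bg = np.full((2, 2, 3), (150, 60, 60), dtype=np.uint8)  # stand-in tissue colour

alpha = chroma_key_alpha(fg)
image = composite(fg, bg, alpha)         # semi-synthetic training image
label = (alpha > 0.5).astype(np.uint8)   # segmentation label obtained for free
```

In this sketch the matte serves twice: as the blending weight and, once thresholded, as the instrument-background label, which is why no manual annotation is needed for the composited image.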
Pages: 1450-1460
Number of pages: 11