Voice-Assisted Image Labeling for Endoscopic Ultrasound Classification Using Neural Networks

Cited by: 11
Authors
Bonmati, Ester [1, 2]
Hu, Yipeng [1, 2]
Grimwood, Alexander [1, 2]
Johnson, Gavin J. [3]
Goodchild, George [3]
Keane, Margaret G. [3]
Gurusamy, Kurinchi [4]
Davidson, Brian [4]
Clarkson, Matthew J. [1, 2]
Pereira, Stephen P. [5]
Barratt, Dean C. [1, 2]
Affiliations
[1] UCL, Wellcome EPSRC Ctr Intervent & Surg Sci WEISS, London W1W 7TS, England
[2] UCL, UCL Ctr Med Image Comp, London W1W 7TS, England
[3] Univ Coll London Hosp, Dept Gastroenterol, London NW1 2BU, England
[4] UCL, Div Surg & Intervent Sci, London W1W 7TS, England
[5] UCL, Inst Liver & Digest Hlth, London W1W 7TS, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
Ultrasonic imaging; Training; Labeling; Standards; Real-time systems; Task analysis; Deep learning; Automatic labeling; classification; deep learning; ultrasound; voice;
DOI
10.1109/TMI.2021.3139023
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Ultrasound imaging is a commonly used technology for visualising patient anatomy in real time during diagnostic and therapeutic procedures. High operator dependency and low reproducibility make ultrasound imaging and interpretation challenging, with a steep learning curve. Automatic image classification using deep learning has the potential to overcome some of these challenges by supporting ultrasound training for novices, as well as aiding more experienced practitioners in interpreting ultrasound images from patients with complex pathology. However, deep learning methods require a large amount of data to provide accurate results. Labelling large ultrasound datasets is a challenging task because labels are retrospectively assigned to 2D images without the 3D spatial context that is available in vivo or that would be inferred while visually tracking structures between frames during the procedure. In this work, we propose a multi-modal convolutional neural network (CNN) architecture that labels endoscopic ultrasound (EUS) images from raw verbal comments provided by a clinician during the procedure. We use a CNN composed of two branches, one for voice data and another for image data, which are joined to predict image labels from the spoken names of anatomical landmarks. The network was trained using recorded verbal comments from expert operators. Our results show a prediction accuracy of 76% at image level on a dataset with 5 different labels. We conclude that the addition of spoken commentaries can increase the performance of ultrasound image classification and eliminate the burden of manually labelling the large EUS datasets necessary for deep learning applications.
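The two-branch design in the abstract, where voice features and image features are joined before the label is predicted, can be illustrated with a minimal sketch. This is not the authors' network: the convolutional branches are replaced by precomputed feature vectors, and the feature sizes, random weights, and label names (`label_0` to `label_4`) are all hypothetical placeholders. Only the count of 5 labels comes from the abstract.

```python
import math
import random

# Placeholder names; the abstract only states that there are 5 labels.
LABELS = ["label_0", "label_1", "label_2", "label_3", "label_4"]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(voice_features, image_features, weights, biases):
    """Join the outputs of the two branches and predict one label.

    The fusion here is plain concatenation followed by a single linear
    layer, which is an assumption made for illustration.
    """
    joint = list(voice_features) + list(image_features)
    scores = [
        sum(w * x for w, x in zip(row, joint)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs


# Stand-ins for the outputs of the voice branch and the image branch.
rng = random.Random(0)
voice_features = [rng.uniform(-1, 1) for _ in range(4)]
image_features = [rng.uniform(-1, 1) for _ in range(6)]

# Untrained, random classifier weights: one row of 10 weights per label.
weights = [[rng.uniform(-0.5, 0.5) for _ in range(10)] for _ in LABELS]
biases = [0.0] * len(LABELS)

label_index, probs = fuse_and_classify(
    voice_features, image_features, weights, biases
)
print(LABELS[label_index], [round(p, 3) for p in probs])
```

In a trained model the two feature vectors would come from learned convolutional layers applied to the audio and the ultrasound frame, and the classifier weights would be fitted to the recorded commentary rather than drawn at random.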
Pages: 1311-1319
Number of pages: 9