Voice-Assisted Image Labeling for Endoscopic Ultrasound Classification Using Neural Networks

Cited by: 11

Authors:

Bonmati, Ester ^{[1,2]}

Hu, Yipeng ^{[1,2]}

Grimwood, Alexander ^{[1,2]}

Johnson, Gavin J. ^{[3]}

Goodchild, George ^{[3]}

Keane, Margaret G. ^{[3]}

Gurusamy, Kurinchi ^{[4]}

Davidson, Brian ^{[4]}

Clarkson, Matthew J. ^{[1,2]}

Pereira, Stephen P. ^{[5]}

Barratt, Dean C. ^{[1,2]}

Affiliations:

[1] UCL, Wellcome EPSRC Ctr Intervent & Surg Sci WEISS, London W1W 7TS, England

[2] UCL, UCL Ctr Med Image Comp, London W1W 7TS, England

[3] Univ Coll London Hosp, Dept Gastroenterol, London NW1 2BU, England

[4] UCL, Div Surg & Intervent Sci, London W1W 7TS, England

[5] UCL, Inst Liver & Digest Hlth, London W1W 7TS, England

Source:

IEEE TRANSACTIONS ON MEDICAL IMAGING | 2022 / Vol. 41 / Issue 06

Funding:

Engineering and Physical Sciences Research Council (EPSRC), UK;

Keywords:

Ultrasonic imaging; Training; Labeling; Standards; Real-time systems; Task analysis; Deep learning; Automatic labeling; classification; deep learning; ultrasound; voice;

DOI:

10.1109/TMI.2021.3139023

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification codes:

081203 ; 0835 ;

Abstract:

Ultrasound imaging is a commonly used technology for visualising patient anatomy in real time during diagnostic and therapeutic procedures. High operator dependency and low reproducibility make ultrasound imaging and interpretation challenging, with a steep learning curve. Automatic image classification using deep learning has the potential to overcome some of these challenges by supporting ultrasound training in novices, as well as aiding ultrasound image interpretation in patients with complex pathology for more experienced practitioners. However, the use of deep learning methods requires a large amount of data in order to provide accurate results. Labelling large ultrasound datasets is a challenging task because labels are retrospectively assigned to 2D images without the 3D spatial context available in vivo or that would be inferred while visually tracking structures between frames during the procedure. In this work, we propose a multi-modal convolutional neural network (CNN) architecture that labels endoscopic ultrasound (EUS) images from raw verbal comments provided by a clinician during the procedure. We use a CNN composed of two branches, one for voice data and another for image data, which are joined to predict image labels from the spoken names of anatomical landmarks. The network was trained using recorded verbal comments from expert operators. Our results show a prediction accuracy of 76% at image level on a dataset with 5 different labels. We conclude that the addition of spoken commentaries can increase the performance of ultrasound image classification, and eliminate the burden of manually labelling large EUS datasets necessary for deep learning applications.
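
The two-branch design described in the abstract (one branch for voice, one for image, joined before the label prediction) can be illustrated with a minimal sketch. This is not the authors' network: the convolutional branches are replaced here by single dense layers so the example runs with the Python standard library alone, the weights are random rather than trained, and all names and feature sizes (`dense`, `predict`, 8 voice features, 16 image features) are assumptions made for illustration. Only the count of 5 labels comes from the abstract.

```python
import math
import random

NUM_LABELS = 5  # the abstract reports a dataset with 5 different labels


def dense(inputs, weights, biases):
    """Fully connected layer followed by ReLU (stand-in for a CNN branch)."""
    return [max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def softmax(logits):
    """Convert logits to probabilities, shifting by the max for stability."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(rng, n_out, n_in):
    """Hypothetical untrained layer: random weights, zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def predict(voice_features, image_features, params):
    """Run each modality through its own branch, join them, classify."""
    voice_hidden = dense(voice_features, *params["voice"])
    image_hidden = dense(image_features, *params["image"])
    fused = voice_hidden + image_hidden  # concatenate the two branch outputs
    weights, biases = params["head"]
    logits = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


rng = random.Random(0)
params = {
    "voice": random_layer(rng, 4, 8),    # assumed: 8 voice features in
    "image": random_layer(rng, 4, 16),   # assumed: 16 image features in
    "head": random_layer(rng, NUM_LABELS, 8),  # 4 + 4 fused features in
}
voice = [rng.random() for _ in range(8)]
image = [rng.random() for _ in range(16)]
probs = predict(voice, image, params)
label = max(range(NUM_LABELS), key=probs.__getitem__)
```

Here `probs` holds one probability per label and `label` is the index of the most likely one. In a real system each branch would be a trained convolutional stack operating on an audio representation and an ultrasound frame respectively; the sketch only shows where the two modalities meet.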


Pages: 1311-1319

Page count: 9
