Development of a graphical user interface for automatic separation of human voice from Doppler ultrasound audio in diving experiments

Cited by: 0
Authors
Azarang, Arian [1]
Blogg, S. Lesley [2,3]
Currens, Joshua [4,5]
Lance, Rachel M. [6]
Moon, Richard E. [6]
Lindholm, Peter [3]
Papadopoulou, Virginie [4,5]
Affiliations
[1] Univ North Carolina Chapel Hill, Biomed Engn Dept, Chapel Hill, NC 27599 USA
[2] SLB Consulting, Winton, Cumbria, England
[3] Univ Calif La Jolla, Sch Med, Dept Emergency Med, La Jolla, CA USA
[4] Univ North Carolina Chapel Hill, Joint Dept Biomed Engn, Chapel Hill, NC 27599 USA
[5] North Carolina State Univ, Raleigh, NC 27695 USA
[6] Duke Univ, Ctr Hyperbar Med & Environm Physiol, Durham, NC USA
Keywords
SPEECH RECOGNITION; DECOMPRESSION;
DOI
10.1371/journal.pone.0283953
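Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code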
07 ; 0710 ; 09 ;
Abstract
Doppler ultrasound (DU) is used in decompression research to detect venous gas emboli in the precordium or subclavian vein as a marker of decompression stress. This is relevant to scuba divers, compressed air workers and astronauts for preventing decompression sickness (DCS), which these bubbles can cause upon or after a sudden reduction in ambient pressure. DU data are graded by expert raters on the Kisman-Masurel or Spencer scales, which are associated with DCS risk. Meta-analyses, as well as efforts to automate DU grading by computer, both require access to large databases of well-curated and graded data. Leveraging previously collected data is especially important because of the difficulty of repeating the large-scale extreme military pressure exposures that were conducted from the 1970s to the 1990s in austere environments. Historically, DU data (Non-speech) were often captured on cassettes in one-channel audio with superimposed human speech describing the experiment (Speech). Digitizing and separating these audio files is currently a lengthy, manual task. In this paper, we develop a graphical user interface (GUI) to perform automatic speech recognition and aid in Non-speech and Speech separation. This constitutes the first study incorporating speech processing technology in the field of diving research. If successful, it has the potential to significantly accelerate the reuse of previously acquired datasets. The recognition task incorporates the Google speech recognizer to detect the presence of human voice activity together with corresponding timestamps. The detected human speech is then separated from the audio Doppler ultrasound within the developed GUI. Several experiments were conducted on recently digitized audio Doppler recordings to corroborate the effectiveness of the developed GUI in recognition and separation tasks, and the results are compared to manual labels for Speech timestamps. The following metrics are used to evaluate performance: the average absolute difference between the reference and detected Speech starting points, and the percentage of detected Speech over the total duration of the reference Speech. The results show the efficacy of the developed GUI in Speech/Non-speech component separation.
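The two evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the segment representation (start and end times in seconds), the function names, and the rule of matching each reference start to the nearest detected start are all assumptions made here for illustration, since the abstract does not specify them.

```python
from typing import List, Tuple

# Hypothetical representation: a Speech segment as (start_s, end_s) in seconds.
Segment = Tuple[float, float]


def mean_abs_start_difference(reference: List[Segment],
                              detected: List[Segment]) -> float:
    """Average absolute difference between reference and detected start times.

    Assumption: each reference start is paired with the nearest detected start.
    """
    if not reference or not detected:
        raise ValueError("both segment lists must be non-empty")
    diffs = [
        min(abs(ref_start - det_start) for det_start, _ in detected)
        for ref_start, _ in reference
    ]
    return sum(diffs) / len(diffs)


def detected_speech_percentage(reference: List[Segment],
                               detected: List[Segment]) -> float:
    """Percentage of the total reference Speech duration covered by detections.

    Assumption: segments within each list do not overlap one another.
    """
    total_reference = sum(end - start for start, end in reference)
    if total_reference <= 0:
        raise ValueError("reference Speech duration must be positive")
    overlap = 0.0
    for ref_start, ref_end in reference:
        for det_start, det_end in detected:
            # Length of the intersection of the two intervals (0 if disjoint).
            overlap += max(0.0, min(ref_end, det_end) - max(ref_start, det_start))
    return 100.0 * overlap / total_reference


# Made-up example timestamps, not data from the study.
reference_segments = [(2.0, 5.0), (10.0, 14.0)]
detected_segments = [(2.5, 5.0), (10.0, 13.0)]

start_error = mean_abs_start_difference(reference_segments, detected_segments)
coverage = detected_speech_percentage(reference_segments, detected_segments)
print(f"mean |start difference| = {start_error:.2f} s")
print(f"detected Speech coverage = {coverage:.1f} %")
```

A lower start difference and a higher coverage percentage would both indicate closer agreement between the automatic detection and the manual labels.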
Pages: 17