Development of a machine-learning based voice disorder screening tool

Cited by: 16
Authors
Reid, Jonathan [1]
Parmar, Preet [2]
Lund, Tyler [3]
Aalto, Daniel K. [4]
Jeffery, Caroline C. [1, 4]
Affiliations
[1] Univ Alberta, Fac Med & Dent, Dept Surg, Div Otolaryngol Head & Neck Surg, Edmonton, AB, Canada
[2] Univ Alberta, Fac Sci, Dept Phys, Edmonton, AB, Canada
[3] Univ Alberta, Fac Engn, Edmonton, AB, Canada
[4] Univ Alberta, Fac Rehabil Med, Commun Sci & Disorders, Edmonton, AB, Canada
Keywords
Machine learning; Voice disorders; Dysphonia; CANCER; CLASSIFICATION; PREVALENCE; PATHOLOGY; PROGNOSIS; SPEECH
DOI
10.1016/j.amjoto.2021.103327
Chinese Library Classification number
R76 [Otorhinolaryngology]
Discipline classification code
100213
Abstract
Objective: Early recognition and referral are crucial for voice disorder management. Limited availability of subspecialists, poor primary care awareness, and the need for specialized equipment impede effective care. Thus, there is a need for a tool to improve voice pathology screening. Machine learning algorithms (MLAs) have shown promise in analyzing acoustic characteristics of phonation. However, few studies report clinical applications of MLAs for voice pathology detection. The objective of this study was to design and validate an MLA for detecting pathological voices.

Methods: An MLA was developed for voice analysis. Audio samples were converted into spectrograms and fed into a pre-existing VGG19 convolutional neural network (CNN) image classifier. The resulting feature map was classified as either pathological or healthy using a Support Vector Machine (SVM) binary linear classifier. This combined MLA was trained with 950 sustained /i/ vowel audio samples from the Saarbrucken Voice Database (SVD), which contains subjects with and without voice disorders. The trained MLA was tested with 406 SVD samples to determine sensitivity, specificity, and overall accuracy. External validation of the MLA was performed using clinical voice samples collected from patients attending a subspecialty voice clinic.

Results: The MLA detected pathologies in SVD samples with 98.5% sensitivity, 97.1% specificity, and 97.8% overall accuracy. In 30 samples obtained prospectively from voice clinic patients, the MLA detected pathologies with 100% sensitivity, 96.3% specificity, and 96.7% overall accuracy.

Conclusions: This study demonstrates that an MLA using a simple audio input can detect diverse vocal pathologies with high sensitivity and specificity. Thus, this algorithm shows promise as a potential screening tool.
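Note: The three figures reported in the Results can be reproduced from confusion-matrix counts. The sketch below is illustrative only and is not the authors' code. The function name `screening_metrics` is hypothetical, and the counts passed to it (3 pathological and 27 healthy samples, with one false positive) are an assumption: the abstract does not report counts, and these are simply values that are arithmetically consistent with the clinic results on 30 samples.

```python
def screening_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp / fn: pathological samples classified as pathological / healthy
    tn / fp: healthy samples classified as healthy / pathological
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one pathological and one healthy sample")
    sensitivity = tp / (tp + fn)              # share of pathological voices detected
    specificity = tn / (tn + fp)              # share of healthy voices correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Assumed counts (not stated in the abstract), chosen to match the
# reported clinic figures for 30 samples.
sens, spec, acc = screening_metrics(tp=3, fn=0, tn=26, fp=1)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```

With so few pathological samples under these assumed counts, a single misclassification would shift sensitivity by about 33 percentage points, so the clinic-sample percentages are best read alongside the sample size.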
Pages: 5