Exploring the feasibility of smart phone microphone for measurement of acoustic voice parameters and voice pathology screening

Cited by: 58
Authors
Uloza, Virgilijus [1]
Padervinskis, Evaldas [1]
Vegiene, Aurelija [1]
Pribuisiene, Ruta [1]
Saferis, Viktoras [2]
Vaiciukynas, Evaldas [3]
Gelzinis, Adas [3]
Verikas, Antanas [3, 4]
Affiliations
[1] Lithuanian Univ Hlth Sci, Dept Otolaryngol, LT-50009 Kaunas, Lithuania
[2] Lithuanian Univ Hlth Sci, Dept Phys Math & Biophys, LT-50009 Kaunas, Lithuania
[3] Kaunas Univ Technol, Dept Elect Power Syst, Kaunas, Lithuania
[4] Halmstad Univ, Dept Intelligent Syst, Halmstad, Sweden
Keywords
Acoustic analysis; Voice screening; Smart phone; CLASSIFICATION; SPEECH; QUESTIONNAIRE; RELIABILITY; PREVALENCE; EFFICACY; PROGRAM;
DOI
10.1007/s00405-015-3708-4
Chinese Library Classification number
R76 [Otorhinolaryngology]
Discipline classification code
100213
Abstract
The objective of this study was to evaluate the reliability of acoustic voice parameters obtained using smart phone (SP) microphones and to investigate the utility of SP voice recordings for voice screening. Voice samples of the sustained vowel /a/ from 118 subjects (34 normal and 84 pathological voices) were recorded simultaneously through two microphones: an oral AKG Perception 220 microphone and the microphone of a Samsung Galaxy Note3 SP. Acoustic voice signal data were measured for fundamental frequency, jitter, shimmer, normalized noise energy (NNE), signal-to-noise ratio and harmonic-to-noise ratio using Dr. Speech software. Discriminant analysis-based Correct Classification Rate (CCR) and Random Forest Classifier (RFC)-based Equal Error Rate (EER) were used to evaluate how well the acoustic voice parameters separate normal and pathological voice classes. The Lithuanian version of the Glottal Function Index (LT_GFI) questionnaire was used for self-assessment of the severity of the voice disorder. The correlations between acoustic voice parameters obtained with the two types of microphones were statistically significant and strong (r = 0.73-1.0) across all measurements. When classifying into normal/pathological voice classes, the Oral-NNE yielded a CCR of 73.7 %, and the pair of SP-NNE and SP-shimmer parameters yielded a CCR of 79.5 %. Fusing the results obtained from SP voice recordings with GFI data gave a CCR of 84.60 % and an RFC-based EER of 7.9 %. In conclusion, measurements of acoustic voice parameters using the SP microphone were shown to be reliable in clinical settings, demonstrating high CCR and low EER when distinguishing normal and pathological voice classes, validating the suitability of the SP microphone signal for automatic voice analysis and screening.
Pages: 3391-3399
Number of pages: 9
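
The abstract reports two figures of merit, CCR and EER, without defining them. The sketch below shows one common way to compute each from classifier outputs. It is an illustration only and not the authors' procedure: the function names, the label convention (1 = pathological, 0 = normal), and the threshold sweep used to approximate the EER are all assumptions introduced here.

```python
import numpy as np


def correct_classification_rate(predicted, labels):
    """Fraction of samples whose predicted class matches the true class."""
    predicted = np.asarray(predicted)
    labels = np.asarray(labels)
    return float(np.mean(predicted == labels))


def equal_error_rate(scores, labels):
    """Approximate EER by sweeping a decision threshold over the scores.

    Assumed convention (hypothetical): labels are 1 for pathological and
    0 for normal, and a higher score means "more likely pathological".
    Returns the mean of the two error rates at the threshold where they
    are closest to each other.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    best_gap, eer = np.inf, None
    for threshold in np.unique(scores):
        flagged = scores >= threshold
        false_alarm = np.mean(flagged[labels == 0])  # normal voices flagged
        miss = np.mean(~flagged[labels == 1])        # pathological voices missed
        gap = abs(false_alarm - miss)
        if gap < best_gap:
            best_gap, eer = gap, (false_alarm + miss) / 2
    return float(eer)
```

With a finite sample the two error rates rarely cross exactly, so the value returned is an approximation; interpolating between neighbouring thresholds is a common refinement.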