Vocal frequency estimation and voicing state prediction with surface EMG pattern recognition

Cited by: 14
Authors
De Armas, Winston [1]
Mamun, Khondaker A. [1,2]
Chau, Tom [1,2]
Affiliations
[1] Univ Toronto, Inst Biomat & Biomed Engn, Toronto, ON M5S 3G9, Canada
[2] Holland Bloorview Kids Rehabil Hosp, Bloorview Res Inst, Toronto, ON M4G 1R8, Canada
Keywords
Fundamental frequency; EMG; Electrolarynx; Pitch modulation; Hands free; Voicing state; Rehabilitation; Classification; Laryngectomy; Signal; Strap
DOI
10.1016/j.specom.2014.04.004
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
The majority of laryngectomees use the electrolarynx as their primary mode of verbal communication after total laryngectomy surgery. However, the archetypal electrolarynx suffers from a monotonous tone and the inconvenience of requiring manual control. This paper examines the potential of pattern recognition to support electrolarynx use by predicting fundamental frequency (F0) and voicing state (VS) from surface EMG of the infrahyoid and suprahyoid muscles, as well as from a respiratory trace. Surface EMG signals from the infrahyoid and suprahyoid muscle groups and a respiratory trace were collected from 10 able-bodied adult males (18-60 years old). Participants performed three kinds of vocal tasks: tones, legatos and phrases. Signal features were extracted from the EMG and respiratory trace, and a Support Vector Machine (SVM) classifier with radial basis function kernels was employed to predict F0 and voicing state. An average root mean squared error of 2.81 ± 0.6 semitones was achieved for the estimation of vocal frequency in the range of 90-360 Hz. An average cross-validation (CV) accuracy of 78.05 ± 6.3% was achieved for the prediction of voicing state from EMG and 65.24 ± 7.8% from the respiratory trace. The proposed method has the advantage of being non-invasive, in contrast to studies that relied on invasive intramuscular electrodes, while still maintaining an accuracy above chance. Pattern classification of neck-muscle surface EMG has merit in the prediction of fundamental frequency and voicing state during vocalization, encouraging further study of automatic pitch modulation for electrolarynges and silent speech interfaces. © 2014 Elsevier B.V. All rights reserved.
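To put the reported error in context: a semitone is one twelfth of an octave, so the 90-360 Hz range (two octaves) spans 24 semitones. The sketch below shows one common way to compute a root mean squared error in semitones from predicted and reference frequencies in Hz. The abstract does not state the exact formula the authors used, so the function names and the frame-by-frame pairing here are illustrative assumptions, not the paper's procedure.

```python
import math


def semitone_error(predicted_hz, reference_hz):
    """Signed pitch difference in semitones (12 semitones per octave)."""
    return 12.0 * math.log2(predicted_hz / reference_hz)


def semitone_rmse(predicted_hz, reference_hz):
    """Root mean squared error, in semitones, over paired frequency values.

    Assumption: both sequences hold positive frequencies in Hz for the
    same voiced frames, in the same order.
    """
    if len(predicted_hz) != len(reference_hz) or len(predicted_hz) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [
        semitone_error(p, r) ** 2 for p, r in zip(predicted_hz, reference_hz)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Width of the frequency range named in the abstract, in semitones.
range_width = semitone_error(360.0, 90.0)
```

Measuring the error on a logarithmic scale means that a given musical interval counts the same at 90 Hz as at 360 Hz, which an error in Hz would not.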
Pages: 15-26
Page count: 12