Missing-Feature Method for Speaker Recognition in Band-Restricted Conditions

Cited by: 0
Authors
Kim, Wooil [1 ]
Hansen, John H. L. [1 ]
Affiliation
[1] Univ Texas Dallas, Erik Jonsson Sch Engn & Comp Sci, CRSS, Richardson, TX 75083 USA
Source
INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5 | 2008
Keywords
speaker recognition; band-limited; missing-feature; marginalization; high order extension
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this study, the missing-feature method is considered to address band-limited speech for speaker recognition. To mitigate possible degradation due to the general speaker-independent model, a two-step reconstruction scheme is developed in which speaker-class-independent and speaker-class-dependent models are used separately. An advanced marginalization in the cepstral domain is proposed that employs a high-order extension method to address the loss of model accuracy caused by cepstrum truncation in the conventional method. To detect the cut-off regions in incoming speech, a blind mask estimation scheme is employed that uses a synthesized band-limited speech model. Experimental results on band-limited conditions indicate that the two-step reconstruction scheme with missing-feature processing is effective in improving in-set/out-of-set speaker recognition performance for band-limited speech, particularly in severely band-restricted conditions (i.e., 4.72% EER improvement in 2, 3, and 4 kHz band-limited conditions over a conventional data-driven method). The proposed marginalization method, which converts the acoustic model using the high-order extension, shows a 0.57% EER improvement over conventional marginalization.
Pages: 1909-1912
Page count: 4