Missing-Feature Method for Speaker Recognition in Band-Restricted Conditions

Cited by: 0

Authors:

Kim, Wooil ^{[1]}

Hansen, John H. L. ^{[1]}

Affiliations:

[1] Univ Texas Dallas, Erik Jonsson Sch Engn & Comp Sci, CRSS, Richardson, TX 75083 USA

Source:

INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5 | 2008

Keywords:

speaker recognition; band-limited; missing-feature; marginalization; high order extension;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this study, the missing-feature method is considered to address band-limited speech for speaker recognition. In an effort to mitigate possible degradation due to the general speaker independent model, a two-step reconstruction scheme is developed, where speaker class independent/dependent models are used separately. An advanced marginalization in the cepstral domain is proposed employing a high order extension method in order to address loss of model accuracy in the conventional method due to cepstrum truncation. To detect the cut-off regions from incoming speech, a blind mask estimation scheme is employed which uses a synthesized band-limited speech model. Experimental results on band-limited conditions indicate that our two-step reconstruction scheme with missing-feature processing is effective in improving in-set/out-of-set speaker recognition performance for band-limited speech, particularly in severely band-restricted conditions (e.g., 4.72% EER improvement in 2, 3, and 4 kHz band-limited conditions over a conventional data-driven method). The improvement of the proposed marginalization method proves its effectiveness for acoustic model conversion by employing high order extension, showing 0.57% EER improvement over conventional marginalization.


Pages: 1909-1912

Page count: 4

References: 9 in total

[1] Discriminative in-set/out-of-set speaker recognition [J]. Angkititrakul, Pongtep; Hansen, John H. L. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (02): 498-508

[2] Robust automatic speech recognition with missing and unreliable acoustic data [J]. Cooke, M; Green, P; Josifovski, L; Vizinho, A. SPEECH COMMUNICATION, 2001, 34 (03): 267-285

[3] Robust continuous speech recognition using parallel model combination [J]. Gales, MJF; Young, SJ. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1996, 4 (05): 352-359

[4] HAKKINEN J, 2001, CRAC WORKSH AALB DEN

[5] Kim W, 2006, INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, P2306

[6] MORALES N, 2005, INTERSPEECH2005

[7] Data-driven environmental compensation for speech recognition: A unified approach [J]. Moreno, PJ; Raj, B; Stern, RM. SPEECH COMMUNICATION, 1998, 24 (04): 267-285

[8] Reconstruction of missing features for robust speech recognition [J]. Raj, B; Seltzer, ML; Stern, RM. SPEECH COMMUNICATION, 2004, 43 (04): 275-296

[9] Reynolds DA, 2003, INT CONF ACOUST SPEE, P53
