Speaker identification in unknown noisy conditions - A universal compensation approach

Cited by: 0
Authors
Ming, J [1]
Stewart, D [1]
Vaseghi, S [1]
Affiliations
[1] Queens Univ Belfast, Sch Comp Sci, Belfast BT7 1NN, Antrim, Northern Ireland
Source
2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING | 2005
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
We consider speaker identification in the presence of background noise, assuming no knowledge of the noise characteristics. A new method, universal compensation (UC), is studied as a solution to this problem. The UC method is an extension of the missing-feature method, i.e. recognition based only on reliable data, made robust to any corruption type, including full corruption that affects all time-frequency components of the speech representation. The UC technique achieves robustness to unknown, full noise corruption through a novel combination of the multi-condition training method and the missing-feature method. Multi-condition training is employed to convert full-band spectral corruption into partial-band spectral corruption, and the missing-feature principle is employed to reduce the effect of the remaining partial-band corruption by basing recognition only on the matched or least-distorted spectral components. The combination of these two strategies makes the new method potentially capable of dealing with arbitrary additive noise - with arbitrary temporal-spectral characteristics - based only on clean speech training data and simulated noise data, without requiring knowledge of the actual noise. The SPIDRE database is used for the evaluation, assuming various corruptions from real-world noise data. The results obtained are encouraging.
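The two-stage idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation; everything below is an assumption made for illustration: a single Gaussian per sub-band stands in for a full speaker model, simulated noise is additive Gaussian noise on the features, noise levels get a uniform prior, and "least-distorted components" is approximated by keeping the `keep` sub-bands with the highest speaker posterior in each frame. All function and parameter names are hypothetical.

```python
import math
import random


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def train_multicondition(clean, noise_stds, rng):
    """Multi-condition training (assumed form): fit one Gaussian per
    (noise level, sub-band) on clean features plus simulated noise.

    clean: list of frames, each a list of sub-band features.
    Returns, per noise level, a list of (mean, var) pairs, one per sub-band.
    """
    n_bands = len(clean[0])
    model = []
    for std in noise_stds:
        level = []
        for b in range(n_bands):
            noisy = [frame[b] + rng.gauss(0.0, std) for frame in clean]
            mean = sum(noisy) / len(noisy)
            var = sum((v - mean) ** 2 for v in noisy) / len(noisy) + 1e-6
            level.append((mean, var))
        model.append(level)
    return model


def band_log_likelihood(x, model, b):
    """log p(x_b | speaker), averaged over noise levels (uniform prior)."""
    terms = [log_gauss(x, *level[b]) for level in model]
    return logsumexp(terms) - math.log(len(terms))


def identify(frames, models, keep):
    """Missing-feature step (assumed form): per frame and per speaker,
    sum only the `keep` largest sub-band log posteriors."""
    scores = [0.0] * len(models)
    for frame in frames:
        per_speaker = [[] for _ in models]
        for b, x in enumerate(frame):
            lls = [band_log_likelihood(x, m, b) for m in models]
            norm = logsumexp(lls)  # normalise across speakers
            for s, ll in enumerate(lls):
                per_speaker[s].append(ll - norm)
        for s, posts in enumerate(per_speaker):
            scores[s] += sum(sorted(posts)[-keep:])
    return scores.index(max(scores)), scores
```

In this sketch, training at several simulated noise levels lets mildly corrupted sub-bands still match one of the trained conditions, while a sub-band hit by noise unlike any trained condition receives a low posterior and is dropped from the score when `keep` is smaller than the number of sub-bands.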
Pages: 617-620
Page count: 4