Combination of autocorrelation-based features and projection measure technique for speaker identification

Cited by: 11

Authors:

Yuo, KH [1]

Hwang, TH [1]

Wang, HC [1]

Affiliation:

[1] Natl Tsing Hua Univ, Dept Elect Engn, Hsinchu 300, Taiwan

Source:

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 2005 / Vol. 13 / No. 04

Keywords:

channel-normalization; relative autocorrelation sequence; projection measure; speaker identification

DOI:

10.1109/TSA.2005.848893

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification code:

070206; 082403

Abstract:

This paper presents a robust approach to speaker identification when the speech signal is corrupted by additive noise and channel distortion. Robust features are derived by assuming that the corrupting noise is stationary and the channel effect is fixed during an utterance. A two-step temporal filtering procedure on the autocorrelation sequence is proposed to minimize the effect of additive and convolutional noise. The first step applies temporal filtering in the autocorrelation domain to remove the additive noise; the second step performs mean subtraction on the filtered autocorrelation sequence in the logarithmic spectrum domain to remove the channel effect. No prior knowledge of the noise characteristics is required, and the additive noise may be colored. The proposed robust feature is then combined with the projection measure technique to further improve recognition accuracy. Experimental results show that the proposed method significantly improves speaker identification performance in noisy environments.
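
The two-step procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the filter or any analysis settings, so the two-frame difference used as the temporal filter, the frame length, hop size, lag count, FFT size, and all function names here are assumptions chosen only for illustration.

```python
import numpy as np


def frame_autocorrelation(signal, frame_len=256, hop=128, max_lag=32):
    """One-sided autocorrelation (lags 0..max_lag) of each frame.

    frame_len, hop and max_lag are illustrative values, not from the paper.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    acs = np.empty((n_frames, max_lag + 1))
    for t in range(n_frames):
        frame = signal[t * hop : t * hop + frame_len]
        full = np.correlate(frame, frame, mode="full")
        # Zero lag sits at index frame_len - 1 of the full correlation.
        acs[t] = full[frame_len - 1 : frame_len + max_lag]
    return acs


def temporal_filter(acs):
    """Step 1 (assumed filter): difference across frames, per lag.

    If the additive noise is stationary, its autocorrelation is the same
    in every frame, so a difference over time cancels it.
    """
    return acs[2:] - acs[:-2]


def log_spectrum_mean_subtraction(filtered, n_fft=128, floor=1e-10):
    """Step 2: log-magnitude spectrum, then mean subtraction over frames.

    A channel that is fixed during the utterance adds a constant offset
    per frequency bin in the log spectrum; subtracting the mean over
    frames removes that offset.
    """
    mag = np.abs(np.fft.rfft(filtered, n=n_fft, axis=1))
    log_mag = np.log(np.maximum(mag, floor))
    return log_mag - log_mag.mean(axis=0, keepdims=True)


def robust_features(signal):
    """Chain both steps on a 1-D signal (hypothetical helper)."""
    acs = frame_autocorrelation(signal)
    return log_spectrum_mean_subtraction(temporal_filter(acs))
```

Under these assumptions, adding the same autocorrelation vector to every frame leaves the output of `temporal_filter` unchanged, and convolving every filtered sequence with one fixed kernel leaves the output of `log_spectrum_mean_subtraction` unchanged. That mirrors the stationary-noise and fixed-channel assumptions stated in the abstract.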

Pages: 565-574

Number of pages: 10

