Experimental Evaluation of Features for Robust Speaker Identification

Cited by: 145
Authors
Reynolds, Douglas A. [1]
Affiliation
[1] MIT, Lincoln Lab, Lexington, MA 02173 USA
Source
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 1994 / Vol. 2 / No. 4
Keywords
Communication channels (information theory) - Database systems - Digital filters - Identification (control systems) - Iterative methods - Mathematical models - Matrix algebra - Robustness (control systems) - Speech analysis - Speech processing - Vectors;
DOI
10.1109/89.326623
Chinese Library Classification
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
This correspondence presents an experimental evaluation of different features and channel compensation techniques for robust speaker identification. The goal is to keep all processing and classification steps constant and to vary only the features and compensations used, allowing a controlled comparison. A general maximum-likelihood classifier based on Gaussian mixture densities is used, and experiments are conducted on the King speech database, a conversational, telephone-speech database. The features examined are mel-frequency and linear-frequency filterbank cepstral coefficients, linear prediction cepstral coefficients, and perceptual linear prediction (PLP) cepstral coefficients. The channel compensation techniques examined are cepstral mean removal, RASTA processing, and a quadratic trend removal technique. It is shown for this database that performance differences between the basic features are small, and the major gains are due to the channel compensation techniques. The best "across-the-divide" recognition accuracy of 92% is obtained for both high-order LPC features and band-limited filterbank features.
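As an illustration of the simplest compensation named in the abstract, cepstral mean removal can be sketched as below. This is a minimal sketch, not the paper's implementation: the function name, array shapes, and synthetic data are assumptions made for the example. It relies on the standard observation that a fixed linear channel shows up as an approximately constant additive offset on each cepstral coefficient, so subtracting the per-utterance mean cancels that offset.

```python
import numpy as np


def cepstral_mean_removal(cepstra: np.ndarray) -> np.ndarray:
    """Subtract the per-coefficient mean taken over all frames of one utterance.

    cepstra: array of shape (num_frames, num_coeffs); the name and layout
    are illustrative assumptions, not taken from the paper.
    """
    return cepstra - cepstra.mean(axis=0, keepdims=True)


# Synthetic (hypothetical) data: 200 frames of 12 cepstral coefficients.
rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 12))

# A fixed channel modeled as a constant offset per cepstral coefficient.
channel_offset = rng.normal(size=(1, 12))
observed = clean + channel_offset

# After mean removal the constant channel offset no longer affects the features.
compensated = cepstral_mean_removal(observed)
```

In this toy setting the compensated features are identical whether or not the channel offset was applied, which is the property the technique is meant to provide for slowly varying telephone channels.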
Pages: 639-643
Page count: 5