An Overview of Speaker Identification: Accuracy and Robustness Issues

Cited by: 188
Authors
Togneri, Roberto [1]
Pullella, Daniel [1]
Affiliations
[1] Univ Western Australia, Nedlands, WA 6009, Australia
Keywords
Speaker recognition; Feature extraction; Robustness; Speech recognition; Automatic speech recognition; Noise measurement; SUPPORT VECTOR MACHINES; DATA SPEECH RECOGNITION; NOISE; FEATURES; NORMALIZATION; DISTORTION; TUTORIAL; TRACKING;
DOI
10.1109/MCAS.2011.941079
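Chinese Library Classification (CLC)
TM [Electrical technology]; TN [Electronic technology, communication technology]
Subject classification codes
0808; 0809
Abstract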
This paper presents the main paradigms for speaker identification and recent work on missing data methods to increase robustness. Feature extraction, speaker modeling and system classification are discussed. Evaluations of speaker identification performance subject to environmental noise are presented. While performance is impressive in clean speech conditions, it degrades rapidly with mismatched additive noise. Missing data methods can compensate for arbitrary disturbances and remove environmental mismatches. An overview of missing data methods is provided, and applications to robust speaker identification are summarized. Finally, combined approaches involving bottom-up estimation and top-down processing are reviewed and their significance is discussed.
Pages: 23-61
Page count: 39