Effect of Nonlinear Compression Function on the Performance of the Speaker Identification System under Noisy Conditions

Cited by: 6
Authors
Jawarkar, Naresh P. [1]
Holambe, Raghunath S. [2]
Basu, Tapan Kumar [3]
Affiliations
[1] Babasaheb Naik Coll Engn, Pusad, MS, India
[2] SGGS Inst Engn & Technol, Nanded, MS, India
[3] Acad Technol, Hooghly, WB, India
Source
PERCEPTION AND MACHINE INTELLIGENCE, 2015
Keywords
Speaker identification; noisy environment; GMM; GFCC; MFCC; CEPSTRAL ANALYSIS; RECOGNITION; FEATURES;
DOI
10.1145/2708463.2709049
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Accurate speaker identification is difficult due to a number of factors, one of the most prominent being environmental noise. This paper addresses the effect of two nonlinear compression functions used in the feature extraction process, namely the log and the cubic root, on the performance of a closed-set, text-independent speaker identification system under clean and noisy speaking environments. Performance is analyzed with Mel frequency cepstral coefficients (MFCC) and Gammatone frequency cepstral coefficients (GFCC). The Gaussian mixture model (GMM) approach is used for speaker modeling. Two databases, a Marathi database and a Hindi database, were used for the experiments. The cubic-root-based features were observed to outperform the log-based features under noisy conditions with SNR < 20 dB.
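The compression step the abstract refers to can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the flooring constant, the unnormalized DCT-II, and the frame energies below are all assumptions chosen for illustration. The sketch takes one frame of filterbank energies (Mel or Gammatone), applies either the log or the cubic root, and then takes a DCT to obtain cepstral coefficients.

```python
import math


def compress(energies, kind="log", floor=1e-10):
    """Apply a nonlinear compression to non-negative filterbank energies.

    `floor` is an assumed guard against log(0); it is not from the paper.
    """
    if kind == "log":
        return [math.log(max(e, floor)) for e in energies]
    if kind == "cubic_root":
        return [max(e, 0.0) ** (1.0 / 3.0) for e in energies]
    raise ValueError(f"unknown compression: {kind}")


def dct_ii(values, num_coeffs):
    """Unnormalized DCT-II, keeping the first `num_coeffs` coefficients."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(num_coeffs)
    ]


def cepstral_coefficients(energies, kind="log", num_coeffs=4):
    """Compress one frame of filterbank energies, then decorrelate with a DCT."""
    return dct_ii(compress(energies, kind), num_coeffs)


# Hypothetical filterbank energies for one frame (illustrative values only).
frame_energies = [8.0, 1.0, 0.125, 27.0, 1e-6, 64.0]

log_cc = cepstral_coefficients(frame_energies, "log")
root_cc = cepstral_coefficients(frame_energies, "cubic_root")
```

In this sketch the two functions differ most on low-energy channels: the log maps an energy of 1e-6 to roughly -13.8, while the cubic root maps it to 0.01, so small energies take up far less of the compressed range under the cubic root.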
Pages: 137-144
Page count: 8