HILBERT ENVELOPE BASED FEATURES FOR ROBUST SPEAKER IDENTIFICATION UNDER REVERBERANT MISMATCHED CONDITIONS

Times cited: 0
Authors
Sadjadi, Seyed Omid [1]
Hansen, John H. L. [1]
Affiliations
[1] Univ Texas Dallas, CRSS, Richardson, TX 75080 USA
Source
2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2011
Keywords
Gammatone filterbank; Hilbert envelope; mismatched conditions; reverberation suppression; speaker identification
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
It is well known that MFCC-based speaker identification (SID) systems easily break down under mismatched training and test conditions. One such mismatch occurs when a SID system is trained on anechoic speech data, while testing is carried out using reverberant data collected via a distant microphone. In this study, a new set of feature parameters based on the Hilbert envelope of Gammatone filterbank outputs is proposed to improve SID performance in the presence of room reverberation. Considering two distinct perceptual effects of reverberation on speech signals, i.e., coloration and long-term reverberation, two different compensation strategies are integrated within the feature extraction framework to suppress the effects of reverberation. Experimental evaluation is performed using speech material from the TIMIT corpus, four different measured room impulse responses (RIRs) from the Aachen Impulse Response (AIR) database, and a GMM-based SID system. The results indicate significant improvement over the baseline system with MFCCs plus cepstral mean subtraction (CMS), confirming the effectiveness of the proposed feature parameters for SID under reverberant mismatched conditions.
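The core operation named in the abstract, taking the Hilbert envelope of Gammatone filterbank outputs, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 4th-order filter, the 50 ms impulse response length, the ERB bandwidth formula, and the peak normalization are all assumptions chosen for illustration, and the two reverberation compensation strategies and the final feature computation are omitted because the abstract does not specify them.

```python
import numpy as np


def gammatone_ir(fc, fs, order=4, duration=0.05):
    """Impulse response of a gammatone filter centred at fc (Hz).

    Assumption: bandwidth follows the common ERB approximation
    24.7 * (4.37 * fc / 1000 + 1), scaled by 1.019.
    """
    t = np.arange(int(duration * fs)) / fs
    erb = 24.7 * (4.37 * fc / 1000.0 + 1.0)
    b = 1.019 * erb
    g = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return g / np.max(np.abs(g))  # peak-normalise (illustrative choice)


def hilbert_envelope(x):
    """Magnitude of the analytic signal, computed via the FFT."""
    n = len(x)
    spectrum = np.fft.fft(x)
    # Keep DC (and Nyquist for even n), double positive frequencies,
    # and zero out negative frequencies.
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * h))


def subband_envelopes(x, fs, centre_freqs):
    """Filter x with each gammatone channel and return one envelope per row."""
    rows = []
    for fc in centre_freqs:
        y = np.convolve(x, gammatone_ir(fc, fs))[:len(x)]
        rows.append(hilbert_envelope(y))
    return np.stack(rows)
```

Calling `subband_envelopes(signal, fs, [500.0, 1000.0, 2000.0])` returns an array with one envelope per centre frequency; a full front end would use many more channels and would add the compensation and feature steps described in the paper.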
Pages: 5448-5451
Page count: 4