Assessment of Single-Channel Speech Enhancement Techniques for Speaker Identification under Mismatched Conditions

Citations: 0
Authors
Sadjadi, Seyed Omid [1]
Hansen, John H. L. [1]
Affiliation
[1] Univ Texas Dallas, CRSS, Dallas, TX 75230 USA
Source
11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4 | 2010
Keywords
feature extraction; gammatone filterbank; Hilbert envelope; speaker identification; speech enhancement; RECOGNITION; SIGNAL
DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract
It is well known that MFCC-based speaker identification (SID) systems easily break down under mismatched training and test conditions. In this paper, we report on a study that considers four different single-channel speech enhancement front-ends for robust SID under such conditions. Speech files from the YOHO database are corrupted with four types of noise (babble, car, factory, and white Gaussian) at five SNR levels (0-20 dB), and processed using four speech enhancement techniques representing distinct classes of algorithms: spectral subtraction, statistical model-based, subspace, and Wiener filtering. Both processed and unprocessed files are submitted to a SID system trained on clean data. In addition, a new set of acoustic feature parameters based on the Hilbert envelope of gammatone filterbank outputs is proposed and evaluated for the SID task. Experimental results indicate that: (i) depending on the noise type and SNR level, the enhancement front-ends may help or hurt SID performance, and (ii) the proposed feature achieves significantly higher SID accuracy than MFCCs under mismatched conditions.
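The corruption step described in the abstract (adding noise to clean speech at a chosen SNR) can be sketched as follows. This is only an illustration, not the authors' procedure: the function names `mix_at_snr` and `measured_snr_db`, the synthetic sine "speech", the white Gaussian noise, the 16 kHz sampling rate, and the 5 dB spacing of the five SNR levels are all assumptions, since the record gives only the 0-20 dB range.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / noise power equals `snr_db`, then add it.

    Hypothetical helper for illustration; the record does not describe
    how the authors scaled or aligned the noise.
    """
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)
    # Repeat and crop the noise so it covers the whole utterance.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # Gain that brings the scaled noise power to p_speech / 10^(snr_db / 10).
    gain = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise


def measured_snr_db(speech, noisy):
    """SNR in dB of `noisy` relative to the clean reference `speech`."""
    residual = np.asarray(noisy, dtype=float) - np.asarray(speech, dtype=float)
    return 10.0 * np.log10(np.mean(np.square(speech)) / np.mean(residual ** 2))


# Synthetic stand-ins: a 1 s, 440 Hz tone as "speech" and white Gaussian noise.
rng = np.random.default_rng(0)
speech = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
noise = rng.standard_normal(8000)

# Assumed grid: five levels spaced 5 dB apart across the stated 0-20 dB range.
for snr in (0, 5, 10, 15, 20):
    noisy = mix_at_snr(speech, noise, snr)
    print(snr, round(measured_snr_db(speech, noisy), 2))
```

Computing the gain from average power over the whole utterance keeps the sketch short; a real evaluation might instead measure speech power over active-speech frames only, which would change the effective SNR.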
Pages: 2138-2141
Page count: 4