Assessment of Single-Channel Speech Enhancement Techniques for Speaker Identification under Mismatched Conditions

Cited by: 0
Authors
Sadjadi, Seyed Omid [1 ]
Hansen, John H. L. [1 ]
Affiliations
[1] Univ Texas Dallas, CRSS, Dallas, TX 75230 USA
Source
11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4 | 2010
Keywords
feature extraction; gammatone filterbank; Hilbert envelope; speaker identification; speech enhancement; RECOGNITION; SIGNAL
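DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification code
0808; 0809
Abstract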
It is well known that MFCC-based speaker identification (SID) systems break down easily under mismatched training and test conditions. In this paper, we report on a study that considers four different single-channel speech enhancement front-ends for robust SID under such conditions. Speech files from the YOHO database are corrupted with four types of noise (babble, car, factory, and white Gaussian) at five SNR levels (0-20 dB) and processed using four speech enhancement techniques representing distinct classes of algorithms: spectral subtraction, statistical model-based, subspace, and Wiener filtering. Both processed and unprocessed files are submitted to a SID system trained on clean data. In addition, a new set of acoustic feature parameters based on the Hilbert envelope of gammatone filterbank outputs is proposed and evaluated for the SID task. Experimental results indicate that (i) depending on the noise type and SNR level, the enhancement front-ends may help or hurt SID performance, and (ii) the proposed feature achieves significantly higher SID accuracy than MFCCs under mismatched conditions.
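The noise-corruption step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names (`mean_power`, `mix_at_snr`, `measured_snr_db`), the 5 dB spacing of the SNR grid, the file-level (whole-signal) power measurement, and the synthetic sine and Gaussian stand-ins for speech and noise are all assumptions made for illustration.

```python
import math
import random


def mean_power(x):
    """Average power (mean of squared samples) of a sequence."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the overall SNR equals `snr_db`.

    Assumption: SNR is measured over the whole signal, not per frame.
    """
    if len(noise) < len(speech):
        # Repeat the noise until it covers the speech signal.
        reps = -(-len(speech) // len(noise))  # ceiling division
        noise = list(noise) * reps
    noise = noise[: len(speech)]
    # Choose the gain so that P_speech / (gain^2 * P_noise) = 10^(snr_db / 10).
    gain = math.sqrt(
        mean_power(speech) / (mean_power(noise) * 10.0 ** (snr_db / 10.0))
    )
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB of `noisy` relative to the known `clean` signal."""
    residual = [y - c for c, y in zip(clean, noisy)]
    return 10.0 * math.log10(mean_power(clean) / mean_power(residual))


# Hypothetical stand-ins: a 440 Hz tone for speech, white Gaussian noise.
random.seed(0)
sample_rate = 8000
speech = [math.sin(2.0 * math.pi * 440.0 * n / sample_rate) for n in range(sample_rate)]
noise = [random.gauss(0.0, 1.0) for _ in range(sample_rate // 2)]

# Assumed grid: five levels spanning 0-20 dB in 5 dB steps.
noisy_by_snr = {snr: mix_at_snr(speech, noise, snr) for snr in (0, 5, 10, 15, 20)}
```

In an actual experiment the stand-ins would be replaced by recorded speech and noise samples, and each noisy file would then be passed either directly or through an enhancement front-end to the SID system.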
Pages: 2138-2141
Number of pages: 4
Related papers
50 in total
  • [31] Biophysically-inspired single-channel speech enhancement in the time domain
    Wen, Chuan
    Verhulst, Sarah
    INTERSPEECH 2023, 2023, : 775 - 779
  • [32] New Results in Modulation-Domain Single-Channel Speech Enhancement
    Mowlaee, Pejman
    Blass, Martin
    Kleijn, W. Bastiaan
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (11) : 2125 - 2137
  • [33] A Comparative Study on Single-Channel Noise Estimation Methods for Speech Enhancement
    Veisi, Hadi
    Sameti, Hossein
    2012 12TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS (ISDA), 2012, : 645 - 650
  • [34] Modulation-domain Kalman filtering for single-channel speech enhancement
    So, Stephen
    Paliwal, Kuldip K.
    SPEECH COMMUNICATION, 2011, 53 (06) : 818 - 829
  • [35] Two-Stage Temporal Processing for Single-Channel Speech Enhancement
    Samui, Sunzan
    Chakrabarti, Indrajit
    Ghosh, Soumya Kanti
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3723 - 3727
  • [36] Single-channel speech enhancement using Kalman filtering in the modulation domain
    So, Stephen
    Wojcicki, Kamil K.
    Paliwal, Kuldip K.
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 993 - 996
  • [37] Deep Learning with Augmented Kalman Filter for Single-Channel Speech Enhancement
    Roy, Sujan Kumar
    Nicolson, Aaron
    Paliwal, Kuldip K.
    2020 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2020,
  • [38] Single-channel dereverberation by feature mapping using cascade neural networks for robust distant speaker identification and speech recognition
    Aditya Arie Nugraha
    Kazumasa Yamamoto
    Seiichi Nakagawa
    EURASIP Journal on Audio, Speech, and Music Processing, 2014
  • [39] INCORPORATING MULTI-CHANNEL WIENER FILTER WITH SINGLE-CHANNEL SPEECH ENHANCEMENT ALGORITHM
    Yong, Pei Chee
    Nordholm, Sven
    Dam, Hai Huyen
    Leung, Yee Hong
    Lai, Chiong Ching
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7284 - 7288
  • [40] Joint Speech Enhancement and Speaker Identification Using Monte Carlo Methods
    Maina, Ciira Wa
    Walsh, John MacLaren
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 1359 - 1362