Assessment of Single-Channel Speech Enhancement Techniques for Speaker Identification under Mismatched Conditions

Cited by: 0

Authors:

Sadjadi, Seyed Omid [1]

Hansen, John H. L. [1]

Affiliations:

[1] Univ Texas Dallas, CRSS, Dallas, TX 75230 USA

Source:

11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4 | 2010

Keywords:

feature extraction; gammatone filterbank; Hilbert envelope; speaker identification; speech enhancement; RECOGNITION; SIGNAL;

DOI:

Not available

Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Subject classification code:

0808 ; 0809 ;

Abstract:

It is well known that MFCC-based speaker identification (SID) systems break down easily under mismatched training and test conditions. In this paper, we report on a study that considers four different single-channel speech enhancement front-ends for robust SID under such conditions. Speech files from the YOHO database are corrupted with four types of noise (babble, car, factory, and white Gaussian) at five SNR levels (0-20 dB) and processed using four speech enhancement techniques representing distinct classes of algorithms: spectral subtraction, statistical model-based, subspace, and Wiener filtering. Both processed and unprocessed files are submitted to a SID system trained on clean data. In addition, a new set of acoustic feature parameters based on the Hilbert envelope of gammatone filterbank outputs is proposed and evaluated for the SID task. Experimental results indicate that (i) depending on the noise type and SNR level, the enhancement front-ends may help or hurt SID performance, and (ii) the proposed feature achieves significantly higher SID accuracy than MFCCs under mismatched conditions.
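The noise-corruption step described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the function names `mix_at_snr` and `measured_snr_db`, the synthetic stand-in signals, and the specific levels 0, 5, 10, 15, and 20 dB are all assumptions made for illustration (the abstract states only "five SNR levels (0-20 dB)").

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech` after scaling it to a target SNR in dB.

    Hypothetical helper: SNR is defined here over the whole utterance
    as the ratio of mean signal power to mean noise power.
    """
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[: len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    # Noise power required so that 10*log10(Ps / Pn) == snr_db.
    target_noise_power = speech_power / (10.0 ** (snr_db / 10.0))
    gain = np.sqrt(target_noise_power / noise_power)
    return speech + gain * noise


def measured_snr_db(speech, noisy):
    """Recover the SNR of a mixture when the clean signal is known."""
    residual = np.asarray(noisy, dtype=float) - np.asarray(speech, dtype=float)
    return 10.0 * np.log10(np.mean(np.square(speech)) / np.mean(residual ** 2))


# Synthetic stand-ins for a speech file and a noise recording (assumed).
rng = np.random.default_rng(0)
speech = np.sin(2 * np.pi * 200 * np.arange(8000) / 8000.0)
noise = rng.standard_normal(8000)

# Assumed grid of five levels spanning the 0-20 dB range.
snr_levels = (0, 5, 10, 15, 20)
noisy_by_snr = {snr: mix_at_snr(speech, noise, snr) for snr in snr_levels}
```

In a real experiment the stand-in arrays would be replaced by audio samples and noise recordings, and a segmental or active-speech SNR definition might be preferred over the whole-utterance definition used here.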


Pages: 2138-2141

Page count: 4

50 items in total

[31] Biophysically-inspired single-channel speech enhancement in the time domain
Wen, Chuan
Verhulst, Sarah
INTERSPEECH 2023, 2023, : 775 - 779
[32] New Results in Modulation-Domain Single-Channel Speech Enhancement
Mowlaee, Pejman
Blass, Martin
Kleijn, W. Bastiaan
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (11) : 2125 - 2137
[33] A Comparative Study on Single-Channel Noise Estimation Methods for Speech Enhancement
Veisi, Hadi
Sameti, Hossein
2012 12TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS (ISDA), 2012, : 645 - 650
[34] Modulation-domain Kalman filtering for single-channel speech enhancement
So, Stephen
Paliwal, Kuldip K.
SPEECH COMMUNICATION, 2011, 53 (06) : 818 - 829
[35] Two-Stage Temporal Processing for Single-Channel Speech Enhancement
Samui, Sunzan
Chakrabarti, Indrajit
Ghosh, Soumya Kanti
17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3723 - 3727
[36] Single-channel speech enhancement using Kalman filtering in the modulation domain
So, Stephen
Wojcicki, Kamil K.
Paliwal, Kuldip K.
11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 993 - 996
[37] Deep Learning with Augmented Kalman Filter for Single-Channel Speech Enhancement
Roy, Sujan Kumar
Nicolson, Aaron
Paliwal, Kuldip K.
2020 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2020
[38] Single-channel dereverberation by feature mapping using cascade neural networks for robust distant speaker identification and speech recognition
Aditya Arie Nugraha
Kazumasa Yamamoto
Seiichi Nakagawa
EURASIP Journal on Audio, Speech, and Music Processing, 2014
[39] INCORPORATING MULTI-CHANNEL WIENER FILTER WITH SINGLE-CHANNEL SPEECH ENHANCEMENT ALGORITHM
Yong, Pei Chee
Nordholm, Sven
Dam, Hai Huyen
Leung, Yee Hong
Lai, Chiong Ching
2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7284 - 7288
[40] Joint Speech Enhancement and Speaker Identification Using Monte Carlo Methods
Maina, Ciira Wa
Walsh, John MacLaren
INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 1359 - 1362
