Assessment of Single-Channel Speech Enhancement Techniques for Speaker Identification under Mismatched Conditions

Cited by: 0

Authors:

Sadjadi, Seyed Omid [1]

Hansen, John H. L. [1]

Affiliations:

[1] Univ Texas Dallas, CRSS, Dallas, TX 75230 USA

Source:

11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4 | 2010

Keywords:

feature extraction; gammatone filterbank; Hilbert envelope; speaker identification; speech enhancement; RECOGNITION; SIGNAL;

DOI:

Not available

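Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes: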

0808 ; 0809 ;

Abstract:

It is well known that MFCC-based speaker identification (SID) systems break down easily under mismatched training and test conditions. In this paper, we report on a study that considers four different single-channel speech enhancement front-ends for robust SID under such conditions. Speech files from the YOHO database are corrupted with four types of noise (babble, car, factory, and white Gaussian) at five SNR levels (0-20 dB) and processed using four speech enhancement techniques representing distinct classes of algorithms: spectral subtraction, statistical model-based, subspace, and Wiener filtering. Both processed and unprocessed files are submitted to a SID system trained on clean data. In addition, a new set of acoustic feature parameters based on the Hilbert envelope of gammatone filterbank outputs is proposed and evaluated for the SID task. Experimental results indicate that (i) depending on the noise type and SNR level, the enhancement front-ends may help or hurt SID performance, and (ii) the proposed feature achieves significantly higher SID accuracy than MFCCs under mismatched conditions.
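The noise-corruption step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names (`power`, `mix_at_snr`, `measured_snr_db`), the synthetic tone standing in for a speech file, the 8 kHz sampling rate, and the evenly spaced SNR grid are all assumptions made for illustration. The abstract states only that five levels between 0 and 20 dB were used.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then add it."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    # Target noise power follows from SNR_dB = 10 * log10(P_speech / P_noise).
    target_noise_power = power(speech) / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / power(noise))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """SNR in dB recovered from the clean signal and the mixture."""
    residual = [y - s for s, y in zip(speech, noisy)]
    return 10.0 * math.log10(power(speech) / power(residual))


rng = random.Random(0)
# Stand-in for a speech file: a 200 Hz tone sampled at 8 kHz (assumed values).
speech = [math.sin(2.0 * math.pi * 200.0 * t / 8000.0) for t in range(8000)]
# Stand-in for the white Gaussian noise condition.
noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]

# Assumed grid: the abstract gives only the range and the number of levels.
snr_levels_db = [0, 5, 10, 15, 20]
noisy_versions = {snr: mix_at_snr(speech, noise, snr) for snr in snr_levels_db}
```

Calling `measured_snr_db(speech, noisy_versions[snr])` for each level should return a value very close to the requested SNR, which is a quick sanity check before passing the mixtures to an enhancement front-end or an SID system.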


Pages: 2138-2141

Page count: 4

