Assessment of Single-Channel Speech Enhancement Techniques for Speaker Identification under Mismatched Conditions

Cited by: 0

Authors:

Sadjadi, Seyed Omid [1]

Hansen, John H. L. [1]

Affiliations:

[1] Univ Texas Dallas, CRSS, Dallas, TX 75230 USA

Source:

11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4 | 2010

Keywords:

feature extraction; gammatone filterbank; Hilbert envelope; speaker identification; speech enhancement; RECOGNITION; SIGNAL

DOI:

Not available

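Chinese Library Classification (CLC) codes:

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification codes:

0808; 0809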

Abstract:

It is well known that MFCC-based speaker identification (SID) systems break down easily under mismatched training and test conditions. In this paper, we report on a study of four single-channel speech enhancement front-ends for robust SID under such conditions. Speech files from the YOHO database are corrupted with four types of noise (babble, car, factory, and white Gaussian) at five SNR levels (0-20 dB) and processed using four speech enhancement techniques representing distinct classes of algorithms: spectral subtraction, statistical model-based, subspace, and Wiener filtering. Both processed and unprocessed files are submitted to a SID system trained on clean data. In addition, a new set of acoustic feature parameters based on the Hilbert envelope of gammatone filterbank outputs is proposed and evaluated for the SID task. Experimental results indicate that (i) depending on the noise type and SNR level, the enhancement front-ends may help or hurt SID performance, and (ii) the proposed feature achieves significantly higher SID accuracy than MFCCs under mismatched conditions.
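
The corruption step mentioned in the abstract (adding noise to clean speech at a chosen SNR) can be sketched as below. This is only an illustrative sketch, not the authors' procedure: the function names `mix_at_snr` and `measured_snr_db`, the sine-wave stand-in for a speech file, the 8 kHz sample rate, and the 5 dB spacing of the five SNR levels are all assumptions, since the abstract gives only the 0-20 dB range.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio is `snr_db`, then add it.

    Hypothetical helper for illustration; returns (noisy_signal, scaled_noise).
    """
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)
    # Repeat or trim the noise so it covers the whole utterance.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    # Gain that brings the noise to the power implied by the target SNR:
    # SNR_dB = 10 * log10(speech_power / (gain**2 * noise_power))
    gain = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    scaled_noise = gain * noise
    return speech + scaled_noise, scaled_noise


def measured_snr_db(speech, scaled_noise):
    """Return the SNR in dB between a clean signal and the noise added to it."""
    return 10.0 * np.log10(np.mean(np.square(speech)) / np.mean(np.square(scaled_noise)))


# Stand-in signals (assumptions): a 200 Hz tone in place of a speech file,
# and white Gaussian noise as one of the four noise types.
rng = np.random.default_rng(0)
speech = np.sin(2 * np.pi * 200 * np.arange(8000) / 8000.0)
noise = rng.standard_normal(8000)

# Assumed 5 dB spacing across the stated 0-20 dB range.
snr_levels = [0, 5, 10, 15, 20]
noisy_versions = {snr: mix_at_snr(speech, noise, snr)[0] for snr in snr_levels}
```

In a setup like the one described, each entry of `noisy_versions` would then be passed either directly or through an enhancement front-end to a SID system trained on clean data.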


Pages: 2138-2141

Page count: 4
