Speaker Re-identification with Speaker Dependent Speech Enhancement

Cited by: 3

Authors:

Shi, Yanpei [1]

Huang, Qiang [1]

Hain, Thomas [1]

Affiliations:

[1] Univ Sheffield, Dept Comp Sci, Speech & Hearing Res Grp, Sheffield, S Yorkshire, England

Source:

INTERSPEECH 2020 | 2020

Funding:

Innovate UK project;

Keywords:

Speech Enhancement; Speaker Identification; Speaker Verification; Noise Robustness; NOISY;

DOI:

10.21437/Interspeech.2020-1772

Chinese Library Classification (CLC) number:

R36 [Pathology]; R76 [Otorhinolaryngology];

Discipline classification codes:

100104 ; 100213 ;

Abstract:

While the use of deep neural networks has significantly boosted speaker recognition performance, it is still challenging to separate speakers in poor acoustic environments. Here, speech enhancement methods have traditionally improved performance. Recent work has shown that adapting speech enhancement can lead to further gains. This paper introduces a novel approach that cascades speech enhancement and speaker recognition. In the first step, a speaker embedding vector is generated, which is used in the second step to enhance the speech quality and re-identify the speakers. Models are trained in an integrated framework with joint optimisation. The proposed approach is evaluated using the VoxCeleb1 dataset, which aims to assess speaker recognition in real-world situations. In addition, three types of noise at different signal-to-noise ratios were added for this work. The results show that the proposed approach, using speaker-dependent speech enhancement, can yield better speaker recognition and speech enhancement performance than two baselines in various noise conditions.
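The abstract mentions adding noise at different signal-to-noise ratios but does not say how the mixing was done. The following is only a minimal sketch of one common way to mix noise into a clean signal at a target SNR. The function names `mix_at_snr` and `measured_snr_db`, the synthetic sine "speech", the Gaussian noise, and the 5 dB value are all illustrative assumptions, not details taken from the paper.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then add it.

    Hypothetical helper for illustration; not the authors' procedure.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if noise_power == 0:
        raise ValueError("noise must not be silent")
    # From SNR_dB = 10 * log10(P_speech / P_noise), solve for the noise power.
    target_noise_power = speech_power / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(clean, noisy):
    """Recover the SNR in dB from a clean signal and its noisy mixture."""
    residual = [y - x for x, y in zip(clean, noisy)]
    clean_power = sum(x * x for x in clean) / len(clean)
    residual_power = sum(r * r for r in residual) / len(residual)
    return 10 * math.log10(clean_power / residual_power)


# Synthetic stand-ins: a 220 Hz tone sampled at 16 kHz and seeded Gaussian noise.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
noisy = mix_at_snr(speech, noise, snr_db=5.0)
```

Calling `measured_snr_db(speech, noisy)` on the mixture is a quick way to confirm that the scaling produced the requested ratio.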


Pages: 1530 - 1534

Page count: 5


[31] Robust several-speaker speech recognition with highly dependable online speaker adaptation and identification
Shih, Po-Yi
Lin, Po-Chuan
Wang, Jhing-Fa
Lin, Yuan-Ning
JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 2011, 34 (05) : 1459 - 1467
[32] Combining missing-feature theory, speech enhancement, and speaker-dependent/-independent modeling for speech separation
Ming, Ji
Hazen, Timothy J.
Glass, James R.
COMPUTER SPEECH AND LANGUAGE, 2010, 24 (01) : 67 - 76
[33] A unified DNN approach to speaker-dependent simultaneous speech enhancement and speech separation in low SNR environments
Gao, Tian
Du, Jun
Dai, Li-Rong
Lee, Chin-Hui
SPEECH COMMUNICATION, 2017, 95 : 28 - 39
[34] Combining Missing-Feature Theory, Speech Enhancement and Speaker-Dependent/-Independent Modeling for Speech Separation
Ming, Ji
Hazen, Timothy J.
Glass, James R.
INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 93 - +
[35] Speaker Identification Based on Physical Variation of Speech Signal
Nandan, Durgesh
Singh, Mahesh Kumar
Kumar, Sanjeev
Yadav, Harendra Kumar
TRAITEMENT DU SIGNAL, 2022, 39 (02) : 711 - 716
[36] SPEAKER IDENTIFICATION FROM SHOUTED SPEECH: ANALYSIS AND COMPENSATION
Hanilci, Cemal
Kinnunen, Tomi
Saeidi, Rahim
Pohjalainen, Jouni
Alku, Paavo
Ertas, Figen
2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 8027 - 8031
[37] Speaker Identification Within Whispered Speech Audio Streams
Fan, Xing
Hansen, John H. L.
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (05) : 1408 - 1421
[38] Speech Coding Influence on Features Dedicated to Speaker Identification
Maka, Tomasz
Bonikowski, Lukasz
ICSES 2008 INTERNATIONAL CONFERENCE ON SIGNALS AND ELECTRONIC SYSTEMS, CONFERENCE PROCEEDINGS, 2008, : 489 - 492
[39] Noise robust speaker identification for spontaneous Arabic speech
Graciarena, Martin
Kajarekar, Sachin
Stolcke, Andreas
Shriberg, Elizabeth
2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 245 - +
[40] SPEAKER IDENTIFICATION IN LOW-RATE CODED SPEECH
Catellier, Andrew
Voran, Stephen
MEASUREMENT OF SPEECH, AUDIO AND VIDEO QUALITY IN NETWORKS, 2008, : 27 - 36
