Speaker Re-identification with Speaker Dependent Speech Enhancement

Cited by: 3
Authors
Shi, Yanpei [1]
Huang, Qiang [1]
Hain, Thomas [1]
Affiliations
[1] Univ Sheffield, Dept Comp Sci, Speech & Hearing Res Grp, Sheffield, S Yorkshire, England
Source
INTERSPEECH 2020 | 2020
Funding
Innovate UK project;
Keywords
Speech Enhancement; Speaker Identification; Speaker Verification; Noise Robustness; NOISY;
DOI
10.21437/Interspeech.2020-1772
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104 ; 100213 ;
Abstract
While the use of deep neural networks has significantly boosted speaker recognition performance, it is still challenging to separate speakers in poor acoustic environments. In such conditions, speech enhancement methods have traditionally improved performance. Recent work has shown that adapting speech enhancement can lead to further gains. This paper introduces a novel approach that cascades speech enhancement and speaker recognition. In the first step, a speaker embedding vector is generated, which is used in the second step to enhance the speech quality and re-identify the speakers. Models are trained in an integrated framework with joint optimisation. The proposed approach is evaluated using the VoxCeleb1 dataset, which aims to assess speaker recognition in real-world situations. In addition, three types of noise at different signal-to-noise ratios were added for this work. The results show that the proposed approach using speaker-dependent speech enhancement can yield better speaker recognition and speech enhancement performance than two baselines in various noise conditions.
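The abstract mentions adding noise at different signal-to-noise ratios without giving the procedure. The following is a minimal sketch of the standard way to mix a noise signal into speech at a target SNR; it is not the authors' pipeline. The function names `mix_at_snr` and `measure_snr_db`, the synthetic sine "speech", the Gaussian noise, and the SNR values 0, 10 and 20 dB are all assumptions made for illustration.

```python
import math
import random


def measure_snr_db(speech, noise):
    """Return the SNR in dB between two equal-length sample sequences."""
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    return 10 * math.log10(speech_power / noise_power)


def mix_at_snr(speech, noise, target_snr_db):
    """Scale `noise` to reach `target_snr_db` and add it to `speech`.

    Returns the noisy mixture and the scaled noise (hypothetical helper,
    not taken from the paper).
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if speech_power == 0 or noise_power == 0:
        raise ValueError("speech and noise must both have non-zero power")
    # SNR_dB = 10 * log10(P_speech / (gain^2 * P_noise)), solved for gain.
    gain = math.sqrt(speech_power / (noise_power * 10 ** (target_snr_db / 10)))
    scaled_noise = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled_noise)]
    return mixture, scaled_noise


# Illustrative data only: a 440 Hz tone stands in for speech (16 kHz, 0.1 s).
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(1600)]
noise = [rng.gauss(0.0, 1.0) for _ in range(1600)]

for target in (0, 10, 20):  # assumed SNR levels, not the paper's
    mixture, scaled = mix_at_snr(speech, noise, target)
    print(target, round(measure_snr_db(speech, scaled), 6))
```

Measuring the SNR against the scaled noise, rather than against the mixture, confirms that the gain was derived correctly for each target level.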
Pages: 1530 - 1534
Page count: 5
Related papers
50 items in total
  • [31] Robust several-speaker speech recognition with highly dependable online speaker adaptation and identification
    Shih, Po-Yi
    Lin, Po-Chuan
    Wang, Jhing-Fa
    Lin, Yuan-Ning
    JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 2011, 34 (05) : 1459 - 1467
  • [32] Combining missing-feature theory, speech enhancement, and speaker-dependent/-independent modeling for speech separation
    Ming, Ji
    Hazen, Timothy J.
    Glass, James R.
    COMPUTER SPEECH AND LANGUAGE, 2010, 24 (01) : 67 - 76
  • [33] A unified DNN approach to speaker-dependent simultaneous speech enhancement and speech separation in low SNR environments
    Gao, Tian
    Du, Jun
    Dai, Li-Rong
    Lee, Chin-Hui
    SPEECH COMMUNICATION, 2017, 95 : 28 - 39
  • [34] Combining Missing-Feature Theory, Speech Enhancement and Speaker-Dependent/-Independent Modeling for Speech Separation
    Ming, Ji
    Hazen, Timothy J.
    Glass, James R.
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 93 - +
  • [35] Speaker Identification Based on Physical Variation of Speech Signal
    Nandan, Durgesh
    Singh, Mahesh Kumar
    Kumar, Sanjeev
    Yadav, Harendra Kumar
    TRAITEMENT DU SIGNAL, 2022, 39 (02) : 711 - 716
  • [36] SPEAKER IDENTIFICATION FROM SHOUTED SPEECH: ANALYSIS AND COMPENSATION
    Hanilci, Cemal
    Kinnunen, Tomi
    Saeidi, Rahim
    Pohjalainen, Jouni
    Alku, Paavo
    Ertas, Figen
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 8027 - 8031
  • [37] Speaker Identification Within Whispered Speech Audio Streams
    Fan, Xing
    Hansen, John H. L.
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (05) : 1408 - 1421
  • [38] Speech Coding Influence on Features Dedicated to Speaker Identification
    Maka, Tomasz
    Bonikowski, Lukasz
    ICSES 2008 INTERNATIONAL CONFERENCE ON SIGNALS AND ELECTRONIC SYSTEMS, CONFERENCE PROCEEDINGS, 2008, : 489 - 492
  • [39] Noise robust speaker identification for spontaneous Arabic speech
    Graciarena, Martin
    Kajarekar, Sachin
    Stolcke, Andreas
    Shriberg, Elizabeth
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 245 - +
  • [40] SPEAKER IDENTIFICATION IN LOW-RATE CODED SPEECH
    Catellier, Andrew
    Voran, Stephen
    MEASUREMENT OF SPEECH, AUDIO AND VIDEO QUALITY IN NETWORKS, 2008, : 27 - 36