Text-aware Speech Separation for Multi-talker Keyword Spotting

Cited by: 0
Authors
Li, Haoyu [1 ]
Yang, Baochen [1 ]
Xi, Yu [1 ]
Yu, Linfeng [1 ]
Tan, Tian [1 ]
Li, Hao [2 ]
Yu, Kai [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, MoE Key Lab Artificial Intelligence, AI Inst, X-LANCE Lab, Shanghai, Peoples R China
[2] AISpeech Ltd, Beijing, Peoples R China
Source
INTERSPEECH 2024
Keywords
multi-talker keyword spotting; text-aware speech separation; robustness
DOI
10.21437/Interspeech.2024-789
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Ensuring the robustness of keyword spotting (KWS) systems in noisy environments is essential. While much research has focused on noisy KWS, less attention has been paid to multi-talker mixed-speech scenarios. Unlike the usual cocktail-party problem, where multi-talker speech is separated using speaker clues, the key challenge here is to extract the target speech for KWS based on text clues. To address this, this paper proposes a novel Text-aware Permutation Determinization Training method for multi-talker KWS with a clue-based Speech Separation front-end (TPDT-SS). Our research highlights the critical role of SS front-ends and shows that incorporating keyword-specific clues into these models can greatly enhance their effectiveness. TPDT-SS shows remarkable success in addressing the permutation problem in mixed keyword speech, thereby greatly boosting the performance of the backend. Additionally, fine-tuning our system on unseen mixed speech yields further performance improvement.
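The record gives no implementation details, but the core idea stated in the abstract (assigning the separator's outputs to references using a keyword/text clue rather than a permutation search, as in permutation-invariant training) can be sketched as follows. This is a minimal illustrative sketch, not the authors' code: the function names, the SI-SNR training criterion, and the per-stream keyword-clue score passed in as `clue_score` are all assumptions.

```python
# Hypothetical sketch of text-determined (rather than permutation-invariant)
# assignment of separation outputs for a two-talker keyword mixture.
import torch

def si_snr(est, ref, eps=1e-8):
    """Scale-invariant SNR (dB) between estimated and reference waveforms, shape (B, T)."""
    est = est - est.mean(dim=-1, keepdim=True)
    ref = ref - ref.mean(dim=-1, keepdim=True)
    proj = (torch.sum(est * ref, dim=-1, keepdim=True)
            / (torch.sum(ref ** 2, dim=-1, keepdim=True) + eps)) * ref
    noise = est - proj
    return 10 * torch.log10((proj.pow(2).sum(-1) + eps) / (noise.pow(2).sum(-1) + eps))

def text_determined_separation_loss(est_sources, ref_keyword, ref_interf, clue_score):
    """
    est_sources: (B, 2, T) two streams from the separation front-end.
    ref_keyword: (B, T) clean keyword (target) speech.
    ref_interf:  (B, T) interfering speech.
    clue_score:  (B, 2) hypothetical keyword/text-clue matching score per stream.
    The stream-to-reference assignment is determined by the text clue instead of
    being searched over all permutations.
    """
    tgt_idx = clue_score.argmax(dim=-1)                   # stream the clue assigns to the keyword
    batch = torch.arange(est_sources.size(0), device=est_sources.device)
    est_tgt = est_sources[batch, tgt_idx]                 # (B, T) keyword stream
    est_int = est_sources[batch, 1 - tgt_idx]             # (B, T) interference stream
    return -(si_snr(est_tgt, ref_keyword) + si_snr(est_int, ref_interf)).mean()
```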
Pages: 337-341
Page count: 5