Robust Keyword Spotting for Noisy Environments by Leveraging Speech Enhancement and Speech Presence Probability

Cited by: 1
Authors
Yang, Chouchang [1]
Saidutta, Yashas Malur [1]
Srinivasa, Rakshith Sharma [1]
Lee, Ching-Hua [1]
Shen, Yilin [1]
Jin, Hongxia [1]
Affiliation
[1] Samsung Res Amer, Mountain View, CA 94043 USA
Source
INTERSPEECH 2023 | 2023
Keywords
keyword spotting; speech commands; speech presence probability; noise robust; speech enhancement
DOI
10.21437/Interspeech.2023-2222
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Although various deep keyword spotting (KWS) systems have demonstrated promising performance in relatively noise-free environments, accurate keyword detection in the presence of strong noise remains challenging. Room acoustics and noise conditions can be highly diverse, leading to drastic performance degradation if not handled carefully. In this paper, we propose a noise management front-end called SE-SPP Net that jointly performs speech enhancement (SE) and speech presence probability (SPP) estimation for robust KWS in noise. The SE-SPP Net estimates both the denoised Mel spectrogram and the position of the speech utterance in the noisy signal, where the latter is estimated as the probability that a particular time-frequency bin contains speech. Further, it adds almost no model size compared with a model that estimates the denoised speech. Our SE-SPP Net can improve noisy KWS performance by up to 7% compared with a similarly sized state-of-the-art model at an SNR of -10 dB.
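To make the abstract's two quantities concrete, the following is a minimal NumPy sketch, not the SE-SPP Net itself. It mixes a synthetic "utterance" with white noise at -10 dB SNR and computes one possible soft per-bin speech presence label. Everything here is an assumption for illustration: the function names, the frame and hop sizes, the use of a linear-frequency power spectrogram instead of the Mel spectrogram mentioned in the abstract, and the ratio-style label, since this record does not state how the paper defines its SPP target.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale noise so the speech-to-noise power ratio equals snr_db, then add."""
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    scaled_noise = scale * noise
    return speech + scaled_noise, scaled_noise

def power_spectrogram(signal, frame_len=256, hop=128):
    """Hann-windowed short-time power spectrogram, shape (frames, bins)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def speech_presence_target(speech_pow, noise_pow, eps=1e-12):
    """Hypothetical soft label in [0, 1]: share of each bin's energy due to speech."""
    return speech_pow / (speech_pow + noise_pow + eps)

# Synthetic stand-in for a keyword: a 440 Hz tone burst surrounded by silence.
rng = np.random.default_rng(0)
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
speech = np.zeros(sample_rate)
speech[6000:10000] = np.sin(2 * np.pi * 440 * t[6000:10000])
noise = rng.standard_normal(sample_rate)

noisy, scaled_noise = mix_at_snr(speech, noise, snr_db=-10.0)
achieved_snr = 10 * np.log10(np.mean(speech ** 2) / np.mean(scaled_noise ** 2))

# Per-bin label: close to 1 where speech dominates, 0 where only noise is present.
spp = speech_presence_target(
    power_spectrogram(speech), power_spectrogram(scaled_noise)
)
```

In this sketch the label is zero in frames that contain no speech and rises in the bins around 440 Hz during the burst, which is the kind of time-frequency localization the abstract attributes to the SPP output.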
Pages: 1638-1642
Number of pages: 5