Robust Keyword Spotting for Noisy Environments by Leveraging Speech Enhancement and Speech Presence Probability

Cited by: 1
Authors
Yang, Chouchang [1]
Saidutta, Yashas Malur [1]
Srinivasa, Rakshith Sharma [1]
Lee, Ching-Hua [1]
Shen, Yilin [1]
Jin, Hongxia [1]
Affiliation
[1] Samsung Res Amer, Mountain View, CA 94043 USA
Source
INTERSPEECH 2023 | 2023
Keywords
keyword spotting; speech commands; speech presence probability; noise robust; speech enhancement;
DOI
10.21437/Interspeech.2023-2222
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
Although various deep keyword spotting (KWS) systems have demonstrated promising performance in relatively noise-free environments, accurate keyword detection under strong noise remains challenging. Room acoustics and noise conditions can be highly diverse, and performance degrades drastically if they are not handled carefully. In this paper, we propose a noise-management front-end, SE-SPP Net, that jointly performs speech enhancement (SE) and speech presence probability (SPP) estimation for robust KWS in noise. SE-SPP Net estimates both the denoised Mel spectrogram and the position of the speech utterance in the noisy signal, where the latter is estimated as the probability that a given time-frequency bin contains speech. Further, it comes at almost no additional cost in model size compared with a model that estimates only the denoised speech. SE-SPP Net improves noisy KWS performance by up to 7% over a similarly sized state-of-the-art model at an SNR of -10 dB.
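The joint estimation described in the abstract can be pictured as one shared trunk with two output heads. The following is a minimal NumPy sketch under stated assumptions, not the authors' architecture: the layer sizes, the random weights, and the names `init_params` and `se_spp_forward` are all hypothetical and chosen only for illustration. One head returns a denoised Mel estimate, and the other passes its output through a sigmoid so that each time-frequency bin receives a value in [0, 1] that can be read as a speech presence probability.

```python
import numpy as np


def init_params(n_mels=40, hidden=64, seed=0):
    """Random weights for a toy shared trunk and two heads (illustrative sizes)."""
    rng = np.random.default_rng(seed)
    scale = 0.1
    return {
        "W_trunk": rng.normal(0.0, scale, (n_mels, hidden)),
        "b_trunk": np.zeros(hidden),
        "W_se": rng.normal(0.0, scale, (hidden, n_mels)),   # speech enhancement head
        "b_se": np.zeros(n_mels),
        "W_spp": rng.normal(0.0, scale, (hidden, n_mels)),  # speech presence head
        "b_spp": np.zeros(n_mels),
    }


def se_spp_forward(noisy_mel, params):
    """Map a noisy Mel spectrogram (frames x mel bins) to two same-shaped outputs.

    Returns (denoised_mel, spp), where spp holds one probability per
    time-frequency bin.
    """
    # Shared features computed once and reused by both heads.
    h = np.tanh(noisy_mel @ params["W_trunk"] + params["b_trunk"])
    # Head 1: regression output standing in for the denoised Mel spectrogram.
    denoised_mel = h @ params["W_se"] + params["b_se"]
    # Head 2: sigmoid squashes logits into [0, 1] per time-frequency bin.
    logits = h @ params["W_spp"] + params["b_spp"]
    spp = 1.0 / (1.0 + np.exp(-logits))
    return denoised_mel, spp


params = init_params()
# Stand-in input: 100 frames of 40 Mel bins drawn at random (not real audio).
noisy_mel = np.random.default_rng(1).normal(size=(100, 40))
denoised_mel, spp = se_spp_forward(noisy_mel, params)
print(denoised_mel.shape, spp.shape)
```

Both outputs have the same shape as the input, so a downstream keyword spotter could consume either or both. How the paper actually combines them is not stated in this record.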
Pages: 1638-1642
Number of pages: 5