Robust Keyword Spotting for Noisy Environments by Leveraging Speech Enhancement and Speech Presence Probability

Cited by: 1
Authors
Yang, Chouchang [1]
Saidutta, Yashas Malur [1]
Srinivasa, Rakshith Sharma [1]
Lee, Ching-Hua [1]
Shen, Yilin [1]
Jin, Hongxia [1]
Affiliations
[1] Samsung Research America, Mountain View, CA 94043, USA
Source
INTERSPEECH 2023 | 2023
Keywords
keyword spotting; speech commands; speech presence probability; noise robust; speech enhancement
DOI
10.21437/Interspeech.2023-2222
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Although various deep keyword spotting (KWS) systems have demonstrated promising performance in relatively noise-free environments, accurate keyword detection in the presence of strong noise remains challenging. Room acoustics and noise conditions can be highly diverse, leading to drastic performance degradation if not handled carefully. In this paper, we propose a noise management front-end called SE-SPP Net that jointly performs speech enhancement (SE) and speech presence probability (SPP) estimation for robust KWS in noise. The SE-SPP Net estimates both the denoised Mel spectrogram and the position of the speech utterance in the noisy signal, where the latter is estimated as the probability that a particular time-frequency bin contains speech. Further, it comes at almost no cost in model size compared to a model that estimates only the denoised speech. Our SE-SPP Net can improve noisy KWS performance by up to 7% compared to a similarly sized state-of-the-art model at an SNR of -10 dB.
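To make the -10 dB operating point concrete, the following is a minimal sketch of how a noisy test signal at a chosen SNR could be constructed. It is not taken from the paper: the helper names `mean_power` and `mix_at_snr`, the sine tone standing in for speech, and the Gaussian noise are all illustrative assumptions. The only fact it relies on is the standard definition SNR = 10 log10(P_speech / P_noise), so at -10 dB the noise carries ten times the power of the speech.

```python
import math
import random


def mean_power(signal):
    """Average power of a signal given as a list of samples."""
    return sum(v * v for v in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / noise power matches `snr_db`, then add it.

    Hypothetical helper for illustration; returns (noisy_signal, scaled_noise).
    """
    # Noise power required for the target SNR: P_noise = P_speech / 10^(SNR/10)
    target_noise_power = mean_power(speech) / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / mean_power(noise))
    scaled_noise = [gain * n for n in noise]
    noisy = [s + n for s, n in zip(speech, scaled_noise)]
    return noisy, scaled_noise


random.seed(0)
sample_rate = 16000  # assumed sampling rate, not stated in the abstract
# One second of a 440 Hz tone as a stand-in for a speech utterance.
speech = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
noise = [random.gauss(0.0, 1.0) for _ in range(sample_rate)]

noisy, scaled_noise = mix_at_snr(speech, noise, snr_db=-10.0)
measured_snr_db = 10.0 * math.log10(mean_power(speech) / mean_power(scaled_noise))
noise_to_speech_ratio = mean_power(scaled_noise) / mean_power(speech)
print(f"measured SNR: {measured_snr_db:.2f} dB, noise/speech power: {noise_to_speech_ratio:.2f}")
```

Scaling the noise rather than the speech keeps the utterance level fixed, which makes results at different SNRs easier to compare.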
Pages: 1638-1642
Page count: 5