Robust Keyword Spotting for Noisy Environments by Leveraging Speech Enhancement and Speech Presence Probability

Cited by: 1

Authors:

Yang, Chouchang [1]

Saidutta, Yashas Malur [1]

Srinivasa, Rakshith Sharma [1]

Lee, Ching-Hua [1]

Shen, Yilin [1]

Jin, Hongxia [1]

Affiliations:

[1] Samsung Research America, Mountain View, CA 94043, USA

Source:

INTERSPEECH 2023 | 2023

Keywords:

keyword spotting; speech commands; speech presence probability; noise robust; speech enhancement

DOI:

10.21437/Interspeech.2023-2222

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Although various deep keyword spotting (KWS) systems have demonstrated promising performance in relatively noiseless environments, accurate keyword detection in the presence of strong noise remains challenging. Room acoustics and noise conditions can be highly diverse, leading to drastic performance degradation if not handled carefully. In this paper, we propose a noise management front-end called SE-SPP Net that jointly performs speech enhancement (SE) and speech presence probability (SPP) estimation for robust KWS in noise. The SE-SPP Net estimates both the denoised Mel spectrogram and the position of the speech utterance in the noisy signal, where the latter is estimated as the probability that a particular time-frequency bin contains speech. Further, it comes at almost no additional cost in model size compared with a model that estimates only the denoised speech. Our SE-SPP Net can improve noisy KWS performance by up to 7% compared to a similarly sized state-of-the-art model at an SNR of -10 dB.
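
To give a sense of how severe the -10 dB condition in the abstract is, the following is a minimal sketch of the standard signal-to-noise ratio definition, SNR (dB) = 10 log10(P_speech / P_noise). It is not the authors' data pipeline, which this record does not describe. The helper names `power` and `mix_at_snr` are hypothetical, and the sine tone is only a placeholder for a speech waveform.

```python
import math
import random


def power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that 10*log10(P_speech / P_noise) equals `snr_db`, then add it to `speech`."""
    target_noise_power = power(speech) / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / power(noise))
    scaled_noise = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled_noise)]
    return mixture, scaled_noise


random.seed(0)
# Placeholder signals (assumptions): a 440 Hz tone stands in for speech, Gaussian samples for noise.
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [random.gauss(0.0, 1.0) for _ in range(16000)]

mixture, scaled_noise = mix_at_snr(speech, noise, snr_db=-10.0)
achieved_snr_db = 10.0 * math.log10(power(speech) / power(scaled_noise))
noise_to_speech_ratio = power(scaled_noise) / power(speech)
print(achieved_snr_db, noise_to_speech_ratio)
```

Under this definition, an SNR of -10 dB means the noise carries about ten times the power of the speech, which is why detection in that regime is considered difficult.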


Pages: 1638-1642

Page count: 5

