A PROGRESSIVE LEARNING APPROACH TO ADAPTIVE NOISE AND SPEECH ESTIMATION FOR SPEECH ENHANCEMENT AND NOISY SPEECH RECOGNITION

Cited by: 9

Authors:

Nian, Zhaoxu [1]

Tu, Yan-Hui [1]

Du, Jun [1]

Lee, Chin-Hui [2]

Affiliations:

[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China

[2] Georgia Inst Technol, Atlanta, GA 30332 USA

Source:

2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) | 2021

Funding:

National Key Research and Development Program of China;

Keywords:

Speech recognition; speech enhancement; progressive learning; improved minima controlled recursive averaging; adaptive noise and speech estimation;

DOI:

10.1109/ICASSP39728.2021.9413395

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

In this paper, we propose a progressive learning-based adaptive noise and speech estimation (PL-ANSE) method for speech preprocessing in noisy speech recognition. The method leverages both the frame-level noise tracking capability of improved minima controlled recursive averaging (IMCRA) and utterance-level deep progressive learning of the nonlinear interactions between speech and noise. First, a bi-directional long short-term memory model is adopted at each network layer to learn progressive ratio masks (PRMs) as targets with progressively increasing signal-to-noise ratios. Then, the PRMs estimated at the utterance level are incorporated into a conventional frame-level speech enhancement algorithm. Finally, the enhanced speech, based on multi-level information fusion, is fed directly into a speech recognition system to improve recognition performance. Experiments on the CHiME-4 single-channel real test data show that the proposed approach achieves a relative word error rate (WER) reduction of 22.1% over unprocessed noisy speech (from 23.84% to 18.57%).
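
The relative WER reduction quoted in the abstract can be checked from the two absolute WER figures it reports. The following is a minimal sketch of that arithmetic only; the function name `relative_wer_reduction` is a hypothetical helper introduced here for illustration and does not come from the paper.

```python
def relative_wer_reduction(baseline_wer: float, enhanced_wer: float) -> float:
    """Return the relative WER reduction in percent.

    Hypothetical helper (not from the paper): computes
    100 * (baseline - enhanced) / baseline.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - enhanced_wer) / baseline_wer


# WER values as quoted in the abstract (CHiME-4 single-channel real test data).
reduction = relative_wer_reduction(23.84, 18.57)
print(f"{reduction:.1f}%")  # prints "22.1%"
```

The absolute reduction is 23.84 - 18.57 = 5.27 percentage points; dividing by the 23.84% baseline gives the 22.1% relative figure stated in the abstract.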

Pages: 6913 - 6917

Page count: 5

50 items in total

[31] Spectral-domain speech enhancement for speech recognition
You, Chang Huai
Ma, Bin
SPEECH COMMUNICATION, 2017, 94 : 30 - 41
[32] DUAL APPLICATION OF SPEECH ENHANCEMENT FOR AUTOMATIC SPEECH RECOGNITION
Pandey, Ashutosh
Liu, Chunxi
Wang, Yun
Saraf, Yatharth
2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT), 2021 : 223 - 228
[33] An approach for speech enhancement with dysarthric speech recognition using optimization based machine learning frameworks
Jolad B.
Khanai R.
International Journal of Speech Technology, 2023, 26 (02) : 287 - 305
[34] Learning to Inference with Early Exit in the Progressive Speech Enhancement
Li, Andong
Zheng, Chengshi
Zhang, Lu
Li, Xiaodong
29TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2021), 2021 : 466 - 470
[35] Adaptive Threshold for Speech Enhancement in Nonstationary Noisy Environments
Lee, Soo-Jeong
Kim, Sun-Hyob
JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA, 2008, 27 (07) : 386 - 393
[36] Combining DCT and Adaptive KLT for Noisy Speech Enhancement
Ou, Shifeng
Zhao, Xiaohui
Dong, Jing
2007 INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, NETWORKING AND MOBILE COMPUTING, VOLS 1-15, 2007 : 2857 - 2860
[37] Noise Robust Exemplar Matching for Speech Enhancement: Applications to Automatic Speech Recognition
Yilmaz, Emre
Baby, Deepak
Van hamme, Hugo
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015 : 688 - 692
[38] An adaptive KLT approach for speech enhancement
Rezayee, A
Gazor, S
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2001, 9 (02) : 87 - 95
[39] A new noisy speech recognition method
Zhao, XQ
Wang, J
International Symposium on Communications and Information Technologies 2005, Vols 1 and 2, Proceedings, 2005 : 282 - 286
[40] Adaptive filtering for speech enhancement in colored noise
Lee, KY
Lee, BG
Ann, S
IEEE SIGNAL PROCESSING LETTERS, 1997, 4 (10) : 277 - 279