A PROGRESSIVE LEARNING APPROACH TO ADAPTIVE NOISE AND SPEECH ESTIMATION FOR SPEECH ENHANCEMENT AND NOISY SPEECH RECOGNITION

Cited by: 9

Authors:

Nian, Zhaoxu [1]

Tu, Yan-Hui [1]

Du, Jun [1]

Lee, Chin-Hui [2]

Affiliations:

[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China

[2] Georgia Inst Technol, Atlanta, GA 30332 USA

Source:

2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) | 2021

Funding:

National Key R&D Program of China;

Keywords:

Speech recognition; speech enhancement; progressive learning; improved minima controlled recursive averaging; adaptive noise and speech estimation;

DOI:

10.1109/ICASSP39728.2021.9413395

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

In this paper, we propose a progressive learning-based adaptive noise and speech estimation (PL-ANSE) method for speech preprocessing in noisy speech recognition. The method leverages the frame-level noise tracking capability of improved minima controlled recursive averaging (IMCRA) together with utterance-level deep progressive learning of the nonlinear interactions between speech and noise. First, a bi-directional long short-term memory model is adopted at each network layer to learn progressive ratio masks (PRMs) as targets with progressively increasing signal-to-noise ratios. Then, the PRMs estimated at the utterance level are integrated into a conventional frame-level speech enhancement algorithm. Finally, the enhanced speech, based on this multi-level information fusion, is fed directly into a speech recognition system to improve recognition performance. Experiments show that the proposed approach achieves a relative word error rate (WER) reduction of 22.1% compared with unprocessed noisy speech (from 23.84% to 18.57%) on the CHiME-4 single-channel real test data.
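
The reported figures can be cross-checked with a few lines of arithmetic. The sketch below is not from the paper: it assumes the usual definition of relative reduction, (baseline - improved) / baseline, and the helper name `relative_reduction` is hypothetical. Only the two WER values are taken from the abstract.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Return the relative reduction of `improved` w.r.t. `baseline`, in percent.

    Hypothetical helper for illustration; assumes the standard definition
    (baseline - improved) / baseline.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - improved) / baseline


# WER values (%) quoted in the abstract for the CHiME-4 single-channel real test data.
baseline_wer = 23.84  # unprocessed noisy speech
enhanced_wer = 18.57  # after PL-ANSE preprocessing

absolute_drop = baseline_wer - enhanced_wer
relative_drop = relative_reduction(baseline_wer, enhanced_wer)

print(f"absolute reduction: {absolute_drop:.2f} points")
print(f"relative reduction: {relative_drop:.1f}%")
```

Under that assumed definition, the absolute drop of 5.27 points corresponds to a relative reduction of about 22.1%, which agrees with the figure stated in the abstract.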


Pages: 6913-6917

Number of pages: 5
