A PROGRESSIVE LEARNING APPROACH TO ADAPTIVE NOISE AND SPEECH ESTIMATION FOR SPEECH ENHANCEMENT AND NOISY SPEECH RECOGNITION

Cited by: 9
Authors
Nian, Zhaoxu [1]
Tu, Yan-Hui [1]
Du, Jun [1]
Lee, Chin-Hui [2]
Affiliations
[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China
[2] Georgia Inst Technol, Atlanta, GA 30332 USA
Source
2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) | 2021
Funding
National Key Research and Development Program of China;
Keywords
Speech recognition; speech enhancement; progressive learning; improved minima controlled recursive averaging; adaptive noise and speech estimation;
DOI
10.1109/ICASSP39728.2021.9413395
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
In this paper, we propose a progressive learning-based adaptive noise and speech estimation (PL-ANSE) method for speech preprocessing in noisy speech recognition. The method combines the frame-level noise tracking capability of improved minima controlled recursive averaging (IMCRA) with utterance-level deep progressive learning of the nonlinear interactions between speech and noise. First, a bi-directional long short-term memory model is adopted at each network layer to learn progressive ratio masks (PRMs) as targets with progressively increasing signal-to-noise ratios. Then, the PRMs estimated at the utterance level are combined with a conventional frame-level speech enhancement algorithm to enhance the speech. Finally, the enhanced speech, based on this multi-level information fusion, is fed directly into a speech recognition system to improve recognition performance. Experiments show that the proposed approach achieves a relative word error rate (WER) reduction of 22.1% compared with unprocessed noisy speech (from 23.84% to 18.57%) on the CHiME-4 single-channel real test data.
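The relative WER reduction quoted in the abstract can be checked from the two absolute WERs it reports. The following is a minimal sketch of that arithmetic only; the function name `relative_wer_reduction` is a hypothetical helper introduced for illustration and is not taken from the paper.

```python
def relative_wer_reduction(baseline_wer: float, system_wer: float) -> float:
    """Return the relative WER reduction in percent.

    Both arguments are absolute WERs in percent. The helper name is
    hypothetical (an illustration, not part of the paper).
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - system_wer) / baseline_wer


# WERs reported in the abstract: unprocessed noisy speech vs. enhanced speech.
reduction = relative_wer_reduction(23.84, 18.57)
print(f"{reduction:.1f}%")  # prints 22.1%
```

The absolute reduction is 5.27 percentage points; dividing by the 23.84% baseline gives the 22.1% relative figure.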
Pages: 6913-6917
Number of pages: 5