Noise robust automatic speech recognition: review and analysis

Cited by: 3
Authors
Dua M. [1]
Akanksha [1]
Dua S. [2]
Affiliations
[1] Department of Computer Engineering, National Institute of Technology, Kurukshetra
[2] Department of Electronics and Communication Engineering, National Institute of Technology, Kurukshetra
Keywords
Acoustic modeling; Feature extraction; Noise robust ASR; Word error rate (WER)
DOI
10.1007/s10772-023-10033-0
Abstract
Automatic speech recognition (ASR) is an emerging technology used in fields such as robotics, traffic control, and healthcare. The leading cause of ASR performance degradation is a mismatch between the training and testing environments, and the main reason for this mismatch is the presence of noise during the testing phase. Researchers have applied various techniques in the front-end and back-end phases of ASR to detect and handle noise. However, very few review papers have used noise as the criterion for comparing existing research. The objective of this survey is therefore to review and analyze the effective methods that scientists and researchers have proposed to improve the noise robustness of ASR systems. First, the paper discusses the basic architecture of an ASR system, the factors affecting its performance, and the formulation of the noise problem. Second, it analyzes existing state-of-the-art noise-robust ASR methods in terms of front-end feature extraction techniques and back-end classification models. Third, it gives a detailed review of the speech databases used by these methods. Finally, it presents an analysis of these noise-resistant ASR techniques in terms of performance metrics. Throughout the analysis, the paper discusses the feature extraction techniques, back-end classification methods, speech databases, and performance metrics in detail. It also discusses existing challenges and describes future research directions for building noise-resistant ASR systems. © 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
Pages: 475-519
Page count: 44