Noise robust automatic speech recognition: review and analysis

Cited by: 3
Authors
Dua M. [1]
Akanksha [1]
Dua S. [2]
Affiliations
[1] Department of Computer Engineering, National Institute of Technology, Kurukshetra
[2] Department of Electronics and Communication Engineering, National Institute of Technology, Kurukshetra
Keywords
Acoustic modeling; Feature extraction; Noise robust ASR; Word error rate (WER)
DOI
10.1007/s10772-023-10033-0
Abstract
Automatic Speech Recognition (ASR) is an emerging technology used in fields such as robotics, traffic control, and healthcare. The leading cause of ASR performance degradation is a mismatch between the training and testing environments, and the main reason for this mismatch is the presence of noise during the testing phase. Researchers have applied various techniques in the front-end and back-end phases of ASR to detect and handle noise. However, very few review papers have used noise as the criterion for comparing existing research. Hence, the objective of this survey is to review and analyze the effective methods that researchers have proposed to boost the noise robustness of an ASR system. First, the paper discusses the basic architecture of an ASR system, the factors affecting its performance, and the formulation of the noise problem. Second, it analyzes existing state-of-the-art noise-robust ASR methods in terms of front-end feature extraction techniques and back-end classification models. Then, it gives a detailed review of the speech databases used by these methods. Finally, it presents an analysis of these noise-resistant ASR techniques in terms of performance metrics. Throughout the analysis, the paper discusses the feature extraction techniques, back-end classification methods, speech databases, and performance metrics in detail. It also discusses existing challenges and describes future research directions for building noise-resistant ASR systems. © 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
Pages: 475-519
Page count: 44
Related papers
50 in total
  • [21] A companding front end for noise-robust automatic speech recognition
    Guinness, J
    Raj, B
    Schmidt-Nielsen, B
    Turicchia, L
    Sarpeshkar, R
    2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 249 - 252
  • [22] Noise-Robust Algorithm of Speech Features Extraction for Automatic Speech Recognition System
    Yakhnev, A. N.
    Pisarev, A. S.
    PROCEEDINGS OF THE XIX IEEE INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND MEASUREMENTS (SCM 2016), 2016, : 206 - 208
  • [23] Multiple resolution analysis for robust automatic speech recognition
    Gemello, R
    Mana, F
    Albesano, D
    De Mori, R
    COMPUTER SPEECH AND LANGUAGE, 2006, 20 (01): : 2 - 21
  • [24] Mixtures of Bayesian Joint Factor Analyzers for Noise Robust Automatic Speech Recognition
    Cui, Xiaodong
    Goel, Vaibhava
    Kingsbury, Brian
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 3011 - 3015
  • [25] Sparse coding of the modulation spectrum for noise-robust automatic speech recognition
    Sara Ahmadi
    Seyed Mohammad Ahadi
    Bert Cranen
    Lou Boves
    EURASIP Journal on Audio, Speech, and Music Processing, 2014
  • [26] Novel frequency masking curves for noise-robust automatic speech recognition
    Chen, Chia-Ping
    Yeh, Ja-Zang
    Wu, Bo-Feng
    JOURNAL OF THE CHINESE INSTITUTE OF ENGINEERS, 2013, 36 (06) : 696 - 703
  • [27] Exemplar-Based Sparse Representations for Noise Robust Automatic Speech Recognition
    Gemmeke, Jort F.
    Virtanen, Tuomas
    Hurmalainen, Antti
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (07): : 2067 - 2080
  • [28] Sparse coding of the modulation spectrum for noise-robust automatic speech recognition
    Ahmadi, Sara
    Ahadi, Seyed Mohammad
    Cranen, Bert
    Boves, Lou
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2014, : 1 - 20
  • [29] A MODULATION FEATURE SET FOR ROBUST AUTOMATIC SPEECH RECOGNITION IN ADDITIVE NOISE AND REVERBERATION
    Liu, Xiaoyu
    Sadeghian, Roozbeh
    Zahorian, Stephen A.
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5230 - 5234
  • [30] Employing Robust Principal Component Analysis for Noise-Robust Speech Feature Extraction in Automatic Speech Recognition with the Structure of a Deep Neural Network
    Hung, Jeih-weih
    Lin, Jung-Shan
    Wu, Po-Jen
    APPLIED SYSTEM INNOVATION, 2018, 1 (03) : 1 - 14