Noise robust automatic speech recognition: review and analysis

被引:3
|
作者
Dua M. [1 ]
Akanksha [1 ]
Dua S. [2 ]
机构
[1] Department of Computer Engineering, National Institute of Technology, Kurukshetra
[2] Department of Electronics and Communication Engineering, National Institute of Technology, Kurukshetra
关键词
Acoustic modeling; Feature extraction; Noise robust ASR; Word error rate (WER);
D O I
10.1007/s10772-023-10033-0
中图分类号
学科分类号
摘要
Automatic Speech Recognition (ASR) system is an emerging technology used in various fields such as robotics, traffic controls, and healthcare, etc. The leading cause of ASR performance degradation is mismatch between the training and testing environments. The main reason for this mismatch is the presence of noise during the testing phase of an ASR system. Various techniques have been used by different researchers in front and backend phases of ASR, to detect and handle the noise. However, a very few review papers have considered noise as a criterion to present the comparison among the existing research works. Hence, the objective of this survey is to analyze and review all the effective methods proposed by different scientists and researchers to boost the noise robustness of an ASR system. Initially, the paper discusses the basic architecture of an ASR system, the factors affecting the its performance, and noise problem formulation. Secondly, the work analysis existing state of the art noise robust ASR methods in terms of front end feature extraction techniques and backend classification model. Then, a detailed review in terms of various speech databases, that are used by these methods, is given. Finally, an analysis in terms of performance metrics of all these noise-resistant ASR techniques is presented. Also, the paper discusses various feature extraction techniques, backend classification methods, different speech databases and performance metrics in detail, while presenting the analysis. The paper also discusses the existing challenges, and describes future research directions in the area of building noise-resistant ASR systems. © 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
引用
收藏
页码:475 / 519
页数:44
相关论文
共 50 条
  • [1] Environmental Noise Analysis for Robust Automatic Speech Recognition
    Kishore, N. Sai Bala
    Venkata, M. Rao
    Nagamani, M.
    ADVANCED COMPUTER AND COMMUNICATION ENGINEERING TECHNOLOGY, 2015, 315
  • [2] An overview of noise-robust automatic speech recognition
    Li, Jinyu
    Deng, Li
    Gong, Yifan
    Haeb-Umbach, Reinhold
    IEEE Transactions on Audio, Speech and Language Processing, 2014, 22 (04): : 745 - 777
  • [3] Robust automatic speech recognition in the presence of impulsive noise
    Potamitis, I
    Fakotakis, N
    Kokkinakis, G
    ELECTRONICS LETTERS, 2001, 37 (12) : 799 - 800
  • [4] An Overview of Noise-Robust Automatic Speech Recognition
    Li, Jinyu
    Deng, Li
    Gong, Yifan
    Haeb-Umbach, Reinhold
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (04) : 745 - 777
  • [5] Noise Adaptive Training for Robust Automatic Speech Recognition
    Kalinli, Ozlem
    Seltzer, Michael L.
    Droppo, Jasha
    Acero, Alex
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (08): : 1889 - 1901
  • [6] CEPSTRAL NOISE SUBTRACTION FOR ROBUST AUTOMATIC SPEECH RECOGNITION
    Rehr, Robert
    Gerkmann, Timo
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 375 - 378
  • [7] Robust automatic speech recognition in impulsive noise environment
    Ding, P
    Cao, ZG
    CHINESE JOURNAL OF ELECTRONICS, 2005, 14 (01): : 165 - 168
  • [8] Noise Robust Speech Features for Automatic Continuous Speech Recognition using Running Spectrum Analysis
    Ohnuki, Kazunaga
    Takahashi, Wataru
    Yoshizawa, Shingo
    Miyanaga, Yoshikazu
    2008 INTERNATIONAL SYMPOSIUM ON COMMUNICATIONS AND INFORMATION TECHNOLOGIES, 2008, : 150 - 153
  • [9] Issues with uncertainty decoding for noise robust automatic speech recognition
    Liao, H.
    Gales, M. J. F.
    SPEECH COMMUNICATION, 2008, 50 (04) : 265 - 277
  • [10] JOINT NOISE ADAPTIVE TRAINING FOR ROBUST AUTOMATIC SPEECH RECOGNITION
    Narayanan, Arun
    Wang, DeLiang
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,