Noise robust automatic speech recognition: review and analysis

Cited by: 3

Authors:

Dua M. [1]

Akanksha [1]

Dua S. [2]

Affiliations:

[1] Department of Computer Engineering, National Institute of Technology, Kurukshetra

[2] Department of Electronics and Communication Engineering, National Institute of Technology, Kurukshetra

Source:

International Journal of Speech Technology | 2023 / Vol. 26 / Issue 2

Keywords:

Acoustic modeling; Feature extraction; Noise robust ASR; Word error rate (WER)

DOI:

10.1007/s10772-023-10033-0

Abstract:

Automatic Speech Recognition (ASR) is an emerging technology used in various fields such as robotics, traffic control, and healthcare. The leading cause of ASR performance degradation is a mismatch between the training and testing environments, and the main reason for this mismatch is the presence of noise during the testing phase. Researchers have applied various techniques in the front-end and back-end phases of ASR to detect and handle noise. However, very few review papers have used noise as a criterion for comparing existing research. Hence, the objective of this survey is to review and analyze the effective methods that have been proposed to boost the noise robustness of an ASR system. First, the paper discusses the basic architecture of an ASR system, the factors affecting its performance, and the formulation of the noise problem. Second, it analyzes existing state-of-the-art noise-robust ASR methods in terms of front-end feature extraction techniques and back-end classification models. Then, it gives a detailed review of the speech databases used by these methods. Finally, it presents an analysis of the performance metrics of these noise-resistant ASR techniques. Throughout the analysis, the paper discusses the feature extraction techniques, back-end classification methods, speech databases, and performance metrics in detail. The paper also discusses existing challenges and describes future research directions in building noise-resistant ASR systems. © 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.

Pages: 475-519

Page count: 44

50 items in total

[31] Noise robust automatic speech recognition with adaptive quantile based noise estimation and speech band emphasizing filter bank
Bonde, CS
Graversen, C
Gregersen, AG
Ngo, KH
Normark, K
Purup, M
Thorsen, T
Lindberg, B
NONLINEAR ANALYSES AND ALGORITHMS FOR SPEECH PROCESSING, 2005, 3817 : 291 - 302
[32] A Review of Signal Subspace Speech Enhancement and Its Application to Noise Robust Speech Recognition
Kris Hermus
Patrick Wambacq
Hugo Van hamme
EURASIP Journal on Advances in Signal Processing, 2007
[33] A review of signal subspace speech enhancement and its application to noise robust speech recognition
Hermus, Kris
Wambacq, Patrick
Van hamme, Hugo
EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2007, 2007 (1)
[34] Automatic speech recognition: A review
Haton, JP
ENTERPRISE INFORMATION SYSTEMS V, 2004 : 6 - 11
[35] Knowledge Distillation-Based Training of Speech Enhancement for Noise-Robust Automatic Speech Recognition
Woo Lee, Geon
Kook Kim, Hong
Kong, Duk-Jo
IEEE ACCESS, 2024, 12 : 72707 - 72720
[36] Noise suppression based on wavelet packet decomposition and quantile noise estimation for robust automatic speech recognition
Rank, Erhard
Van Pham, Tuan
Kubin, Gernot
2006 IEEE International Conference on Acoustics, Speech and Signal Processing, Vols 1-13, 2006 : 477 - 480
[37] NOISE ADAPTIVE TRAINING USING A VECTOR TAYLOR SERIES APPROACH FOR NOISE ROBUST AUTOMATIC SPEECH RECOGNITION
Kalinli, Ozlem
Seltzer, Michael L.
Acero, Alex
2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-8, PROCEEDINGS, 2009 : 3825 - 3828
[38] Noise-robust automatic speech recognition using a predictive echo state network
Skowronski, Mark D.
Harris, John G.
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (05) : 1724 - 1730
[39] A Novel Model Characteristics for Noise-Robust Automatic Speech Recognition Based on HMM
Rafieee, M. Saadeq
Khazaei, Ali Akbar
2010 IEEE INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, NETWORKING AND INFORMATION SECURITY (WCNIS), VOL 2, 2010 : 215 - 218
[40] Noise-robust automatic speech recognition using a discriminative echo state network
Skowronski, Mark D.
Harris, John G.
2007 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOLS 1-11, 2007 : 1771 - 1774