An efficient speech recognition system in adverse conditions using the nonparametric regression

Cited by: 10
Authors
Amrouche, Abderrahmane [1]
Debyeche, Mohamed [1]
Taleb-Ahmed, Abdelmalik [2]
Rouvaen, Jean Michel [3]
Yagoub, Mustapha C. E. [4]
Affiliations
[1] USTHB, Fac Elect & Comp Sci, Algiers 16111, Algeria
[2] Valenciennes Univ, CNRS, LAMIH, UMR 8530, F-59313 Le Mont Houy, France
[3] Valenciennes Univ, CNRS, OAE IEMN, UMR 8520, F-59313 Le Mont Houy, France
[4] Univ Ottawa, SITE, Ottawa, ON K1N 6N5, Canada
Keywords
Arabic digits; Speech recognition; Nonparametric regression; General Regression Neural Network; Hidden Markov Model; Noisy environment; Neural networks
DOI
10.1016/j.engappai.2009.09.006
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
General Regression Neural Networks (GRNN) have been applied to phoneme identification and isolated word recognition in clean speech. In this paper, the authors extended this approach to Arabic spoken word recognition in adverse conditions. Noise robustness is one of the most challenging problems in Automatic Speech Recognition (ASR): most existing recognition methods, which have been shown to be highly efficient under noise-free conditions, fail drastically in noisy environments. The proposed system was tested on Arabic digit recognition at different Signal-to-Noise Ratio (SNR) levels and under four noise conditions taken from the NOISEX-92 database: multi-speaker babble, car production hall (factory), military vehicle (Leopard tank) and fighter jet cockpit (Buccaneer). The proposed scheme compared favorably with similar recognizers based on the Multilayer Perceptron (MLP), the Elman Recurrent Neural Network (RNN) and the discrete Hidden Markov Model (HMM). The experimental results showed that nonparametric regression with an appropriate smoothing factor (spread) improved the generalization power of the neural network and the overall performance of the speech recognizer in noisy environments. (C) 2009 Elsevier Ltd. All rights reserved.
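The role of the smoothing factor mentioned in the abstract can be illustrated with a minimal sketch of the textbook GRNN estimator, which is a Gaussian-kernel weighted average of the training targets. This is not the authors' implementation: the function names `grnn_predict` and `grnn_classify`, the toy 2-D feature vectors and the one-hot targets are all assumptions made for illustration, and the abstract does not state which acoustic features or spread values were used.

```python
import math


def grnn_predict(x, train_x, train_y, spread):
    """Return the kernel-weighted average of the training targets.

    Textbook GRNN form: y(x) = sum_i y_i * k_i / sum_i k_i, with
    k_i = exp(-||x - x_i||^2 / (2 * spread^2)).
    All names here are hypothetical, chosen for this sketch.
    """
    # Squared Euclidean distance from x to every stored pattern.
    d2 = [sum((a - b) ** 2 for a, b in zip(x, xi)) for xi in train_x]
    # Subtracting the smallest distance leaves the ratio unchanged
    # but keeps the largest kernel value at 1, avoiding underflow.
    d2_min = min(d2)
    weights = [math.exp(-(d - d2_min) / (2.0 * spread ** 2)) for d in d2]
    total = sum(weights)
    n_out = len(train_y[0])
    return [
        sum(w * y[k] for w, y in zip(weights, train_y)) / total
        for k in range(n_out)
    ]


def grnn_classify(x, train_x, train_y, spread):
    """Pick the class whose averaged one-hot target is largest."""
    scores = grnn_predict(x, train_x, train_y, spread)
    return max(range(len(scores)), key=lambda k: scores[k])


# Toy data (assumed): two classes, two patterns each, one-hot targets.
TRAIN_X = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
TRAIN_Y = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
```

With a small spread the estimate is dominated by the nearest stored patterns. With a very large spread every pattern receives nearly the same weight and the output tends toward the mean of the targets. This is why the abstract refers to an appropriate choice of spread.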
Pages: 85-94
Number of pages: 10