Robust Speech Recognition Using MLP Neural Network in Log-Spectral Domain

Cited by: 2

Authors:

Ghaemmaghami, Masoumeh P. [1,2]

Sameti, Hossein [3]

Razzazi, Farbod [1]

BabaAli, Bagher [3]

Dabbaghchian, Saeed [3]

Affiliations:

[1] Islamic Azad Univ, Fac Engn, Dept Elect Engn, Sci & Res Branch, Tehran, Iran

[2] Islamic Azad Univ, Fac Engn, Dept Elect Engn, Young Res Club, Tehran, Iran

[3] Sharif Univ Technol, Dept Comp Engn, Speech Proc Lab, Tehran, Iran

Source:

2009 IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY (ISSPIT 2009) | 2009

Keywords:

MLP neural network; log spectral; robust speech recognition

DOI:

10.1109/ISSPIT.2009.5407513

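Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification code: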

0808 ; 0809 ;

Abstract:

This paper proposes an efficient and effective nonlinear feature-domain noise suppression algorithm, motivated by the minimum mean square error (MMSE) optimization criterion. A multilayer perceptron (MLP) neural network operating in the log-spectral domain is used to minimize the difference between noisy and clean speech. Used as a pre-processing stage of a speech recognition system, the method improves the recognition rate in noisy environments. The application of the system is extended to different environments with different noise types without retraining the HMM models. The feature extraction stage is trained on a small portion of noisy data, created by artificially adding different types of noise from the NOISEX-92 database to the TIMIT speech database. In real environments, where speech recognition systems must operate, different types of noise occur at various SNRs. The proposed method offers four strategies, depending on the system's ability to identify the noise type and SNR. Experimental results show that the proposed method achieves a significant improvement in recognition rates.
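The training setup described in the abstract (adding noise to clean speech at a chosen SNR, then fitting an MLP under a mean-square-error criterion to map noisy log-spectral frames to clean ones) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, frame length, hop size, single tanh hidden layer, learning rate, and the synthetic tone and white noise standing in for TIMIT and NOISEX-92 are all hypothetical choices that do not come from the record.

```python
import numpy as np

def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so that clean + noise has the requested SNR in dB.
    Assumes `noise` is at least as long as `clean`."""
    noise = noise[: len(clean)]
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_clean / (p_noise * 10.0 ** (snr_db / 10.0)))
    return clean + gain * noise

def log_spectrum(signal, frame_len=256, hop=128, eps=1e-10):
    """Log power spectrum of Hann-windowed frames.
    Returns an array of shape (n_frames, frame_len // 2 + 1)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(frame_len)
    return np.log(np.abs(np.fft.rfft(frames, axis=1)) ** 2 + eps)

def standardize(a):
    """Shift and scale with a single global mean and standard deviation."""
    return (a - a.mean()) / a.std()

def train_mlp(x, y, hidden=64, lr=0.05, epochs=100, seed=0):
    """One-hidden-layer MLP (tanh) fitted by full-batch gradient descent
    on the mean squared error between its output and `y`.
    Expects standardized inputs; returns (parameters, loss history)."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.1, (x.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.1, (hidden, y.shape[1]))
    b2 = np.zeros(y.shape[1])
    losses = []
    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)           # hidden activations
        err = h @ w2 + b2 - y              # output error
        losses.append(float(np.mean(err ** 2)))
        g_out = 2.0 * err / err.size       # gradient of the MSE w.r.t. output
        g_h = (g_out @ w2.T) * (1.0 - h ** 2)
        w2 -= lr * (h.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        w1 -= lr * (x.T @ g_h)
        b1 -= lr * g_h.sum(axis=0)
    return (w1, b1, w2, b2), losses

if __name__ == "__main__":
    # Synthetic stand-ins (assumptions): a modulated tone for clean speech,
    # white Gaussian noise for the noise recording.
    rng = np.random.default_rng(1)
    t = np.arange(8000) / 8000.0
    clean = np.sin(2 * np.pi * 440 * t) * (1 + 0.5 * np.sin(2 * np.pi * 3 * t))
    noisy = mix_at_snr(clean, rng.normal(size=clean.size), snr_db=10.0)
    x = standardize(log_spectrum(noisy))
    y = standardize(log_spectrum(clean))
    _, losses = train_mlp(x, y)
    print("first loss:", losses[0], "last loss:", losses[-1])
```

In a recognition pipeline of the kind the abstract describes, the trained network would be applied to noisy log-spectral frames before the remaining feature extraction steps, so that the HMM models themselves need no retraining.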

Citation

Pages: 467 / +

Page count: 3
