Single-channel multiple regression for in-car speech enhancement

Cited by: 1
Authors
Li, WF [1]
Itou, K
Takeda, K
Itakura, F
Affiliations
[1] Nagoya Univ, Grad Sch Engn, Dept Informat Elect, Nagoya, Aichi 4648603, Japan
[2] Nagoya Univ, Grad Sch Engn, Dept Med Sci, Nagoya, Aichi 4648603, Japan
[3] Meijo Univ, Fac Sci & Technol, Nagoya, Aichi 4688502, Japan
Keywords
speech enhancement; speech recognition; multi-layer perceptron; mean opinion score; pairwise preference test; environmental adaptation; K-means clustering
DOI
10.1093/ietisy/e89-d.3.1032
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We address issues in improving hands-free speech enhancement and speech recognition performance in different car environments using a single distant microphone. This paper describes a new single-channel in-car speech enhancement method that estimates the log spectra of speech at a close-talking microphone by nonlinear regression on the log spectra of the noisy signal captured by a distant microphone and of the estimated noise. In our subjective evaluation, the regression-enhanced speech shows significant overall quality improvements, and the method performs best on most objective measures. In isolated word recognition experiments conducted in 15 real car environments, the proposed adaptive nonlinear regression approach achieves average relative word error rate (WER) reductions of 50.8% and 13.1% compared with the original noisy speech and the ETSI advanced front-end (ETSI ES 202 050), respectively.
Pages: 1032-1039
Page count: 8