Speech enhancement using generalized weighted β-order spectral amplitude estimator

Cited by: 15
Authors
Deng, Feng [1]
Bao, Feng [1]
Bao, Chang-chun [1]
Affiliations
[1] Beijing Univ Technol, Sch Elect Informat & Control Engn, Speech & Audio Signal Proc Lab, Beijing 100124, Peoples R China
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China
Keywords
Speech enhancement; Auditory masking properties; Generalized weighted spectral amplitude estimator; A priori SNR estimation; Noise
DOI
10.1016/j.specom.2014.01.002
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper proposes a single-channel speech enhancement method based on a generalized weighted β-order spectral amplitude estimator. First, we derive a new generalized weighted β-order Bayesian spectral amplitude estimator that combines the advantages of traditional perceptually weighted estimators and β-order spectral amplitude estimators, yielding a flexible and effective gain function. Second, based on the masking properties of the human auditory system, we propose an adaptive estimation method for the perceptually weighted order p, following the criterion that inaudible noise can be masked rather than removed, which reduces the distortion of the enhanced speech. Third, based on the compressive nonlinearity of the cochlea, the spectral amplitude order β is interpreted as the compression rate of the spectral amplitude, and an adaptive method for calculating β is proposed. In addition, because of its one-frame delay, the decision-directed a priori SNR estimate is inaccurate during speech activity. To overcome this drawback, we present a new a priori SNR estimation method that combines a predicted estimate with the decision-directed rule. Subjective and objective test results indicate that the proposed Bayesian spectral amplitude estimator, combined with the proposed a priori SNR estimation method, achieves a larger segmental SNR improvement, lower log-spectral distortion, and better speech quality than the reference methods. © 2014 Elsevier B.V. All rights reserved.
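The one-frame delay mentioned in the abstract can be illustrated with the classic decision-directed rule. The sketch below is not the estimator proposed in the paper: it tracks a single frequency bin, assumes a constant noise power, and uses a Wiener gain as a stand-in for the spectral amplitude estimator. The function name, the smoothing factor `alpha`, and the floor `xi_min` are illustrative assumptions.

```python
def decision_directed_snr(noisy_power, noise_power, alpha=0.98, xi_min=1e-3):
    """Track the a priori SNR of one frequency bin across frames.

    noisy_power: list of |Y(l)|^2 values, one per frame (assumed input).
    noise_power: noise variance estimate, assumed constant here.
    """
    xi_track = []
    prev_clean_power = 0.0  # |A_hat(l-1)|^2; nothing estimated before frame 0
    for y2 in noisy_power:
        gamma = y2 / noise_power              # a posteriori SNR
        ml_term = max(gamma - 1.0, 0.0)       # instantaneous SNR estimate
        # Decision-directed rule: mostly relies on the PREVIOUS frame's
        # clean-speech estimate, which is the source of the one-frame delay.
        xi = alpha * prev_clean_power / noise_power + (1.0 - alpha) * ml_term
        xi = max(xi, xi_min)
        gain = xi / (1.0 + xi)                # Wiener gain (stand-in estimator)
        prev_clean_power = (gain ** 2) * y2
        xi_track.append(xi)
    return xi_track


# Two noise-only frames followed by a speech onset at frame 2.
track = decision_directed_snr([1.0, 1.0, 101.0, 101.0], noise_power=1.0)
```

In this toy input the instantaneous SNR jumps to 100 at frame 2, but the tracked value stays close to 2 at the onset and only rises in the following frame, which is the lag the abstract describes.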
Pages: 55-68
Page count: 14