Enhancement of noisy speech by temporal and spectral processing

Cited by: 29
Authors
Krishnamoorthy, P. [2]
Prasanna, S. R. M. [1]
Affiliations
[1] Indian Inst Technol, Dept Elect & Commun Engn, Gauhati 781039, Assam, India
[2] Samsung India Software Ctr, Noida 201301, India
Keywords
Speech enhancement; Temporal processing; Spectral processing; Temporal and spectral processing; Linear prediction; Subtraction method; Reduction; Database
DOI
10.1016/j.specom.2010.08.011
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper presents a noisy speech enhancement method that combines linear prediction (LP) residual weighting in the time domain with spectral processing in the frequency domain to provide better noise suppression as well as better enhancement in the speech regions. The noisy speech is first processed by excitation source (LP residual) based temporal processing, which involves identifying and enhancing the excitation source based speech-specific features present at the gross and fine temporal levels. The gross level features are identified by estimating the following speech parameters: the sum of the peaks in the discrete Fourier transform (DFT) spectrum, the smoothed Hilbert envelope of the LP residual, and modulation spectrum values, all from the noisy speech signal. The fine level features are identified using knowledge of the instants of significant excitation. A weight function is derived from the gross and fine weight functions to obtain the temporally processed speech signal. The temporally processed speech is then subjected to spectral domain processing. Spectral processing involves estimation and removal of degrading components, as well as identification and enhancement of speech-specific spectral components. The proposed method is evaluated using different objective and subjective quality measures. The quality measures show that the proposed combined temporal and spectral processing method provides better enhancement than either temporal or spectral processing alone. (C) 2010 Elsevier B.V. All rights reserved.
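The LP residual weighting step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the gross and fine weight functions are computed, so the per-sample `weights` argument below is a hypothetical stand-in supplied by the caller, and the function names, the LP order of 8, and the autocorrelation (Levinson-Durbin) analysis are all assumptions made for illustration. The sketch inverse-filters one frame to get its LP residual, scales the residual sample by sample, and resynthesizes through the all-pole filter.

```python
import math


def demo_signal(n_samples):
    # Hypothetical test frame: two sinusoids plus a small deterministic ripple.
    return [math.sin(0.3 * n) + 0.5 * math.sin(0.71 * n)
            + 0.1 * (((n * 37) % 11) / 11.0 - 0.5) for n in range(n_samples)]


def autocorrelation(x, max_lag):
    # r[k] = sum_i x[i] * x[i + k], for lags 0..max_lag.
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    # Solve the LP normal equations; returns a_1..a_p with x[n] ~ sum_k a_k x[n-k].
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err <= 1e-12:
            break  # signal already perfectly predicted; higher terms stay zero
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]


def lp_residual(x, a):
    # Inverse filtering: e[n] = x[n] - sum_k a_k x[n-k], with x[n] = 0 for n < 0.
    p = len(a)
    return [x[n] - sum(a[k] * x[n - 1 - k] for k in range(min(p, n)))
            for n in range(len(x))]


def lp_synthesize(e, a):
    # All-pole synthesis: y[n] = e[n] + sum_k a_k y[n-k].
    p = len(a)
    y = []
    for n in range(len(e)):
        y.append(e[n] + sum(a[k] * y[n - 1 - k] for k in range(min(p, n))))
    return y


def weight_residual(x, weights, order=8):
    # Scale the LP residual by caller-supplied weights, then resynthesize.
    a = levinson_durbin(autocorrelation(x, order), order)
    e = lp_residual(x, a)
    e_weighted = [w * v for w, v in zip(weights, e)]
    return lp_synthesize(e_weighted, a), e
```

With all weights equal to 1 the frame is reconstructed unchanged, so any enhancement in this sketch comes entirely from the choice of weights, for example values near 1 in speech regions and smaller values elsewhere.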
Pages: 154-174
Number of pages: 21
Related papers
46 in total
[11]   DE-NOISING BY SOFT-THRESHOLDING [J].
DONOHO, DL .
IEEE TRANSACTIONS ON INFORMATION THEORY, 1995, 41 (03) :613-627
[12]   SPEECH ENHANCEMENT USING A MINIMUM MEAN-SQUARE ERROR LOG-SPECTRAL AMPLITUDE ESTIMATOR [J].
EPHRAIM, Y ;
MALAH, D .
IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1985, 33 (02) :443-445
[13]   SPEECH ENHANCEMENT USING A MINIMUM MEAN-SQUARE ERROR SHORT-TIME SPECTRAL AMPLITUDE ESTIMATOR [J].
EPHRAIM, Y ;
MALAH, D .
IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1984, 32 (06) :1109-1121
[14]  
Greenberg S, 1997, INT CONF ACOUST SPEE, P1647, DOI 10.1109/ICASSP.1997.598826
[15]  
Hu Y, 2006, P INT PHIL PA US
[16]   Evaluation of objective quality measures for speech enhancement [J].
Hu, Yi ;
Loizou, Philipos C. .
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2008, 16 (01) :229-238
[17]   Subjective comparison and evaluation of speech enhancement algorithms [J].
Hu, Yi ;
Loizou, Philipos C. .
SPEECH COMMUNICATION, 2007, 49 (7-8) :588-601
[18]   Speech enhancement by residual domain constrained optimization [J].
Jin, Wen ;
Scordilis, Michael S. .
SPEECH COMMUNICATION, 2006, 48 (10) :1349-1364
[19]   Spectral subtraction based on phonetic dependency and masking effects [J].
Kim, W ;
Kang, S ;
Ko, H .
IEE PROCEEDINGS-VISION IMAGE AND SIGNAL PROCESSING, 2000, 147 (05) :423-427
[20]   Temporal and Spectral Processing of Degraded Speech [J].
Krishnamoorthy, P. ;
Prasanna, S. R. Mahadeva .
ADCOM: 2008 16TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING AND COMMUNICATIONS, 2008, :112-118