Enhancement of noisy speech by temporal and spectral processing

Cited by: 29

Authors:

Krishnamoorthy, P. [2]

Prasanna, S. R. M. [1]

Affiliations:

[1] Indian Inst Technol, Dept Elect & Commun Engn, Gauhati 781039, Assam, India

[2] Samsung India Software Ctr, Noida 201301, India

Source:

Speech Communication | 2011 / Vol. 53 / Issue 2

Keywords:

Speech enhancement; Temporal processing; Spectral processing; Temporal and spectral processing; Linear prediction; Subtraction method; Reduction; Database

DOI:

10.1016/j.specom.2010.08.011

Chinese Library Classification (CLC):

O42 [Acoustics]

Subject classification codes:

070206; 082403

Abstract:

This paper presents a noisy speech enhancement method that combines linear prediction (LP) residual weighting in the time domain with spectral processing in the frequency domain to provide better noise suppression as well as better enhancement in the speech regions. The noisy speech is initially processed by excitation source (LP residual) based temporal processing, which involves identifying and enhancing the excitation source based speech-specific features present at the gross and fine temporal levels. The gross level features are identified by estimating the following speech parameters: the sum of the peaks in the discrete Fourier transform (DFT) spectrum, the smoothed Hilbert envelope of the LP residual, and the modulation spectrum values, all from the noisy speech signal. The fine level features are identified using knowledge of the instants of significant excitation. A weight function is derived from the gross and fine weight functions to obtain the temporally processed speech signal. The temporally processed speech is further subjected to spectral domain processing. Spectral processing involves estimation and removal of degrading components, and also identification and enhancement of speech-specific spectral components. The proposed method is evaluated using different objective and subjective quality measures. The quality measures show that the proposed combined temporal and spectral processing method provides better enhancement compared to either temporal or spectral processing alone. © 2010 Elsevier B.V. All rights reserved.
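To make one of the gross-level cues in the abstract concrete, the following is a minimal sketch of computing an LP residual, taking its smoothed Hilbert envelope, and turning that into a weight applied to the residual. It is not the authors' implementation. The function names, the LP order, the frame and hop lengths, the moving-average smoother, and the mapping of the envelope to weights in [w_min, 1] are all assumptions chosen for illustration. The DFT peak sum, the modulation spectrum, the fine-level weights, and the spectral processing stage are omitted.

```python
import numpy as np

def lp_coefficients(frame, order=10):
    """Autocorrelation-method LP: a[k-1] predicts s[n] from s[n-k], k = 1..order."""
    windowed = frame * np.hamming(len(frame))
    full = np.correlate(windowed, windowed, mode="full")
    r = full[len(frame) - 1 : len(frame) + order]  # autocorrelation lags 0..order
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * (r[0] + 1.0) * np.eye(order)  # small ridge keeps the solve stable
    return np.linalg.solve(R, r[1 : order + 1])

def lp_residual(signal, order=10, frame_len=320, hop=160):
    """Frame-wise inverse filtering; frame_len and hop are illustrative values."""
    signal = np.asarray(signal, dtype=float)
    residual = np.zeros_like(signal)
    padded = np.concatenate([np.zeros(order), signal])  # zero history before n = 0
    for start in range(0, len(signal), hop):
        frame = signal[start : start + frame_len]
        if len(frame) <= order:
            break  # tail too short to analyse; its residual stays zero
        a = lp_coefficients(frame, order)
        stop = min(start + hop, len(signal))
        for n in range(start, stop):
            past = padded[n : n + order][::-1]  # s[n-1], ..., s[n-order]
            residual[n] = signal[n] - np.dot(a, past)
    return residual

def hilbert_envelope(x):
    """Magnitude of the analytic signal, built by zeroing negative frequencies."""
    n = len(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1 : n // 2] = 2.0
    else:
        gain[1 : (n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(x) * gain))

def gross_weight(residual, smooth_len=160, w_min=0.1):
    """Assumed mapping: smoothed envelope rescaled into the range [w_min, 1]."""
    envelope = hilbert_envelope(residual)
    kernel = np.ones(smooth_len) / smooth_len
    smooth = np.convolve(envelope, kernel, mode="same")
    return w_min + (1.0 - w_min) * smooth / (smooth.max() + 1e-12)

# Synthetic stand-in for noisy speech: a gated 200 Hz tone at 8 kHz plus noise.
rng = np.random.default_rng(0)
n = np.arange(1600)
clean = np.sin(2 * np.pi * 200 * n / 8000) * (np.abs(n - 800) < 400)
noisy = clean + 0.05 * rng.standard_normal(len(n))

residual = lp_residual(noisy)
weights = gross_weight(residual)
weighted_residual = weights * residual  # would excite the LP synthesis filter
```

In a complete system along the lines the abstract describes, the weighted residual would be passed back through the frame-wise LP synthesis filter to obtain the temporally processed speech. That step is left out here to keep the sketch short.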

Pages: 154-174

Page count: 21
