Enhancement of noisy speech by temporal and spectral processing

Cited by: 29

Authors:

Krishnamoorthy, P. [2]

Prasanna, S. R. M. [1]

Affiliations:

[1] Indian Inst Technol, Dept Elect & Commun Engn, Gauhati 781039, Assam, India

[2] Samsung India Software Ctr, Noida 201301, India

Source:

SPEECH COMMUNICATION | 2011 / Vol. 53 / Issue 02

Keywords:

Speech enhancement; Temporal processing; Spectral processing; Temporal and spectral processing; Linear prediction; Subtraction method; Reduction; Database

DOI:

10.1016/j.specom.2010.08.011

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

This paper presents a noisy speech enhancement method that combines linear prediction (LP) residual weighting in the time domain with spectral processing in the frequency domain, to provide better noise suppression as well as better enhancement in the speech regions. The noisy speech is first processed by excitation source (LP residual) based temporal processing, which involves identifying and enhancing the excitation source based speech-specific features present at the gross and fine temporal levels. The gross level features are identified by estimating the following speech parameters: the sum of the peaks in the discrete Fourier transform (DFT) spectrum, the smoothed Hilbert envelope of the LP residual, and modulation spectrum values, all from the noisy speech signal. The fine level features are identified using knowledge of the instants of significant excitation. A weight function is derived from the gross and fine weight functions to obtain the temporally processed speech signal. The temporally processed speech is then subjected to spectral domain processing. Spectral processing involves estimation and removal of degrading components, and also identification and enhancement of speech-specific spectral components. The proposed method is evaluated using different objective and subjective quality measures. The quality measures show that the proposed combined temporal and spectral processing method provides better enhancement than either temporal or spectral processing alone. © 2010 Elsevier B.V. All rights reserved.
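To make the idea of LP residual weighting concrete, the following is a minimal sketch, not the authors' implementation. It assumes the autocorrelation method with Levinson-Durbin recursion for the LP analysis, and it uses a placeholder weight of 1.0 for every sample; the paper derives its weights from gross and fine level features, which are not reproduced here. All function names (`lp_coefficients`, `lp_residual`, `weight_residual`, `synthesize`) and the toy frame are hypothetical.

```python
import math
import random


def lp_coefficients(frame, order):
    """Autocorrelation-method LP via Levinson-Durbin.

    Returns a[1..order] such that s[n] is predicted by sum_k a[k] * s[n-k].
    """
    n = len(frame)
    r = [sum(frame[i] * frame[i + lag] for i in range(n - lag))
         for lag in range(order + 1)]
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or perfectly predicted frame: stop early
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


def lp_residual(frame, coeffs):
    """Inverse filtering: e[n] = s[n] - sum_k a[k] * s[n-k]."""
    residual = []
    for n in range(len(frame)):
        pred = sum(a * frame[n - k]
                   for k, a in enumerate(coeffs, start=1) if n - k >= 0)
        residual.append(frame[n] - pred)
    return residual


def weight_residual(residual, weights):
    """Scale each residual sample by its weight (weights are an assumption)."""
    return [w * e for w, e in zip(weights, residual)]


def synthesize(residual, coeffs):
    """All-pole synthesis: s'[n] = e'[n] + sum_k a[k] * s'[n-k]."""
    out = []
    for n in range(len(residual)):
        pred = sum(a * out[n - k]
                   for k, a in enumerate(coeffs, start=1) if n - k >= 0)
        out.append(residual[n] + pred)
    return out


# Toy "noisy" frame: a sinusoid plus deterministic pseudo-random noise.
rng = random.Random(0)
frame = [math.sin(2 * math.pi * 0.05 * n) + 0.1 * rng.uniform(-1.0, 1.0)
         for n in range(200)]

coeffs = lp_coefficients(frame, order=8)
residual = lp_residual(frame, coeffs)
weights = [1.0] * len(residual)  # placeholder; unit weights leave speech unchanged
enhanced = synthesize(weight_residual(residual, weights), coeffs)
```

With unit weights the synthesis filter simply undoes the inverse filter, so `enhanced` matches `frame` up to rounding. Weights below 1.0 in regions judged to be noise-dominated would attenuate the excitation there before resynthesis, which is the general mechanism the abstract describes.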

Pages: 154-174

Number of pages: 21
