Iterative Phase Estimation for the Synthesis of Separated Sources From Single-Channel Mixtures

Cited by: 54

Authors:

Gunawan, David [1]

Sen, D. [1]

Affiliations:

[1] Univ New S Wales, Sch Elect Engn & Telecommun, Sydney, NSW 2052, Australia

Source:

IEEE SIGNAL PROCESSING LETTERS | 2010 / Vol. 17 / No. 05

Keywords:

Phase estimation; source separation; synthesis; SPEECH;

DOI:

10.1109/LSP.2010.2042530

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

In this letter, we propose a novel method of refining the time-domain synthesis of individual source estimates from a single-channel mixture. Employing a closed-loop architecture, the algorithm refines the synthesis of each source by iteratively estimating the phase of the sources, given the estimates of the source magnitude spectra and a single-channel time-domain mixture. The performance of the algorithm is evaluated for harmonic musical mixtures, and considerable improvements to the synthesized estimates are obtained relative to phase binary masking, given accurate source magnitude spectra.


Pages: 421-424

Page count: 4

