Statistical Model of Speech Signals Based on Composite Autoregressive System with Application to Blind Source Separation

Cited by: 25

Authors:

Kameoka, Hirokazu [1]

Yoshioka, Takuya [1]

Hamamura, Mariko [1]

Le Roux, Jonathan [1]

Kashino, Kunio [1]

Affiliation:

[1] NTT Corp, NTT Commun Sci Labs, Atsugi, Kanagawa 2430198, Japan

Source:

LATENT VARIABLE ANALYSIS AND SIGNAL SEPARATION | 2010 / Vol. 6365

Keywords:

Blind source separation; composite autoregressive system; nonnegative matrix factorization; audio source separation; frequency domain; mixtures

DOI:

10.1007/978-3-642-15995-4_31

Chinese Library Classification:

TP301 [Theory, Methods]

Discipline classification code:

081202

Abstract:

This paper presents a new statistical model for speech signals, which consists of a time-invariant dictionary incorporating a set of the power spectral densities of excitation signals and a set of all-pole filters where the gain of each pair of excitation and filter elements is allowed to vary over time. We use this model to develop a combined blind separation and dereverberation method for speech. Reasonably good separations were obtained under a highly reverberant condition.
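As a rough illustration of the model structure described in the abstract, the sketch below builds the power spectral density of one frame as a gain-weighted sum over every pairing of an excitation spectrum with an all-pole filter. It is only a minimal sketch under stated assumptions: the function names, the autoregressive coefficient convention A(z) = 1 + sum_k a_k z^{-k}, and the toy numbers are hypothetical and not taken from the paper, and the parameter estimation, separation, and dereverberation steps are not shown.

```python
import cmath
import math


def all_pole_power_response(ar_coeffs, n_freqs):
    """Return |1 / A(e^{jw})|^2 at n_freqs frequencies in [0, pi).

    Assumed convention (not from the paper): A(z) = 1 + sum_k a_k z^{-k}.
    """
    response = []
    for f in range(n_freqs):
        omega = math.pi * f / n_freqs
        a = 1 + sum(c * cmath.exp(-1j * omega * (k + 1))
                    for k, c in enumerate(ar_coeffs))
        response.append(1.0 / abs(a) ** 2)
    return response


def composite_psd(excitation_psds, filters, gains, n_freqs):
    """Model PSD of one frame.

    excitation_psds: list of excitation spectra, each of length n_freqs
    filters:         list of AR coefficient lists, one per all-pole filter
    gains:           gains[i][j] >= 0 is the frame's gain for the pair
                     (excitation i, filter j); this is the only part
                     that would change from frame to frame
    """
    filter_responses = [all_pole_power_response(a, n_freqs) for a in filters]
    psd = [0.0] * n_freqs
    for i, excitation in enumerate(excitation_psds):
        for j, resp in enumerate(filter_responses):
            for f in range(n_freqs):
                psd[f] += gains[i][j] * excitation[f] * resp[f]
    return psd


# Toy usage: one flat excitation, one first-order filter, one gain value.
frame_psd = composite_psd([[1.0] * 4], [[-0.5]], [[3.0]], 4)
```

In this reading, the excitation spectra and filters form the fixed dictionary, and only the `gains` table is re-estimated for each frame.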

Pages: 245-253

Page count: 9

References: 11 in total

[1] Audio source separation with a single sensor
Benaroya, L
Bimbot, F
Gribonval, R
[J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2006, 14 (01) : 191 - 199
[2] Separation of an instantaneous mixture of Gaussian autoregressive sources by the exact maximum likelihood approach
Dégerine, S
Zaïdi, A
[J]. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2004, 52 (06) : 1499 - 1512
[3] Natural gradient multichannel blind deconvolution and speech separation using causal FIR filters
Douglas, SC
Sawada, H
Makino, S
[J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2005, 13 (01) : 92 - 104
[4] Nonnegative Matrix Factorization with the Itakura-Saito Divergence: With Application to Music Analysis
Fevotte, Cedric
Bertin, Nancy
Durrieu, Jean-Louis
[J]. NEURAL COMPUTATION, 2009, 21 (03) : 793 - 830
[5] Composite Autoregressive System for Sparse Source-Filter Representation of Speech
Kameoka, Hirokazu
Kashino, Kunio
[J]. ISCAS: 2009 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOLS 1-5, 2009 : 2477 - 2480
[6] Blind speech dereverberation with multi-channel linear prediction based on short time Fourier transform representation
Nakatani, Tomohiro
Yoshioka, Takuya
Kinoshita, Keisuke
Miyoshi, Masato
Juang, Biing-Hwang
[J]. 2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008 : 85 - 88
[7] Emergence of simple-cell receptive field properties by learning a sparse code for natural images
Olshausen, BA
Field, DJ
[J]. NATURE, 1996, 381 (6583) : 607 - 609
[8] Multichannel Nonnegative Matrix Factorization in Convolutive Mixtures for Audio Source Separation
Ozerov, Alexey
Fevotte, Cedric
[J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (03) : 550 - 563
[9] Measuring dependence of bin-wise separated signals for permutation alignment in frequency-domain BSS
Sawada, Hiroshi
Araki, Shoko
Makino, Shoji
[J]. 2007 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOLS 1-11, 2007 : 3247 - 3250
[10] Blind separation of convolved mixtures in the frequency domain
Smaragdis, P
[J]. NEUROCOMPUTING, 1998, 22 (1-3) : 21 - 34