Statistical Model of Speech Signals Based on Composite Autoregressive System with Application to Blind Source Separation

Cited by: 25
Authors
Kameoka, Hirokazu [1 ]
Yoshioka, Takuya [1 ]
Hamamura, Mariko [1 ]
Le Roux, Jonathan [1 ]
Kashino, Kunio [1 ]
Affiliation
[1] NTT Corp, NTT Commun Sci Labs, Atsugi, Kanagawa 2430198, Japan
Source
LATENT VARIABLE ANALYSIS AND SIGNAL SEPARATION | 2010, Vol. 6365
Keywords
Blind source separation; composite autoregressive system; nonnegative matrix factorization; audio source separation; frequency-domain; mixtures
DOI
10.1007/978-3-642-15995-4_31
Chinese Library Classification (CLC)
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
This paper presents a new statistical model for speech signals, which consists of a time-invariant dictionary incorporating a set of the power spectral densities of excitation signals and a set of all-pole filters where the gain of each pair of excitation and filter elements is allowed to vary over time. We use this model to develop a combined blind separation and dereverberation method for speech. Reasonably good separations were obtained under a highly reverberant condition.
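The model structure described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names, array shapes, and the sign convention A(z) = 1 + sum_k a_k z^-k are assumptions made here for illustration. The sketch builds a time-varying power spectral density as a nonnegative, gain-weighted sum over every pairing of an excitation PSD with an all-pole filter response.

```python
import numpy as np

def all_pole_power_response(ar_coeffs, n_freqs):
    """Return |1 / A(e^{jw})|^2 on n_freqs frequencies spanning [0, pi].

    Assumed convention (not from the source): A(z) = 1 + sum_k a_k z^-k.
    """
    a = np.concatenate(([1.0], np.asarray(ar_coeffs, dtype=float)))
    omega = np.linspace(0.0, np.pi, n_freqs)
    k = np.arange(a.size)
    # Evaluate A on the unit circle: sum_k a_k * exp(-j * w * k)
    A = np.exp(-1j * np.outer(omega, k)) @ a
    return 1.0 / np.abs(A) ** 2

def composite_ar_psd(excitation_psds, filter_coeffs, gains):
    """Combine excitation PSDs and all-pole filters with time-varying gains.

    excitation_psds: (I, F) nonnegative array, one PSD per excitation.
    filter_coeffs:   list of J coefficient sequences, one per all-pole filter.
    gains:           (I, J, T) nonnegative array, gain of pair (i, j) at frame t.
    Returns an (F, T) array holding the modelled PSD for each frame.
    """
    excitation_psds = np.asarray(excitation_psds, dtype=float)
    gains = np.asarray(gains, dtype=float)
    n_freqs = excitation_psds.shape[1]
    # (J, F) table of filter power responses
    filters = np.stack(
        [all_pole_power_response(a, n_freqs) for a in filter_coeffs]
    )
    # Sum over all excitation/filter pairs, weighted by the per-frame gains
    return np.einsum("if,jf,ijt->ft", excitation_psds, filters, gains)
```

The dictionary (excitation PSDs and filter coefficients) stays fixed across frames while only `gains` changes over time, which mirrors the "time-invariant dictionary" with time-varying pair gains mentioned in the abstract. Estimating these quantities from mixtures, and the dereverberation step, are outside the scope of this sketch.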
Pages: 245-253
Page count: 9