Multi-Channel Audio Source Separation Using Multiple Deformed References

Cited by: 19
Authors
Souviraa-Labastie, Nathan [1]
Olivero, Anaik [1]
Vincent, Emmanuel [2]
Bimbot, Frederic [1]
Affiliations
[1] Univ Rennes 1, CNRS, Inria, PANAMA Project Team, IRISA, F-35000 Rennes, France
[2] Inria, F-54600 Villers Les Nancy, France
Funding
European Research Council;
Keywords
Generalized Expectation-Maximization (GEM) algorithm; source separation; nonnegative matrix factorization; blind; information; models;
DOI
10.1109/TASLP.2015.2450494
Chinese Library Classification number
O42 [Acoustics];
Subject classification code
070206 ; 082403 ;
Abstract
We present a general multi-channel source separation framework where additional audio references are available for one (or more) source(s) of a given mixture. Each audio reference is another mixture which is supposed to contain at least one source similar to one of the target sources. Deformations between the sources of interest and their references are modeled in a linear manner using a generic formulation. This is done by adding transformation matrices to an excitation-filter model, hence affecting different axes, namely frequency, dictionary component or time. A nonnegative matrix co-factorization algorithm and a generalized expectation-maximization algorithm are used to estimate the parameters of the model. Different model parameterizations and different combinations of algorithms are tested on music plus voice mixtures guided by music and/or voice references and on professionally-produced music recordings guided by cover references. Our algorithms improve the signal-to-distortion ratio (SDR) of the sources with the lowest intensity by 9 to 15 decibels (dB) with respect to original mixtures.
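The co-factorization idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes single-channel nonnegative spectrograms, a Euclidean cost with standard multiplicative updates (swapped in for simplicity), a reference that contains only the target source, and a single frequency-axis transformation matrix `T`. The function name `cofactorize` and all variable names are hypothetical.

```python
import numpy as np

def cofactorize(V_mix, V_ref, K=4, K_b=4, n_iter=50, seed=0, eps=1e-9):
    """Illustrative shared-dictionary NMF (assumed model, Euclidean cost):
         V_ref ~ W @ H_r                    (reference: target source only)
         V_mix ~ T @ W @ H_s + W_b @ H_b    (deformed target + background)
    T is a nonnegative F x F matrix acting on the frequency axis."""
    rng = np.random.default_rng(seed)
    F, N = V_mix.shape
    M = V_ref.shape[1]
    W = rng.random((F, K)) + eps        # dictionary shared by both signals
    H_s = rng.random((K, N)) + eps      # target activations in the mixture
    H_r = rng.random((K, M)) + eps      # activations in the reference
    W_b = rng.random((F, K_b)) + eps    # background dictionary
    H_b = rng.random((K_b, N)) + eps    # background activations
    T = np.eye(F) + 0.01 * rng.random((F, F))  # near-identity deformation

    def model():
        return T @ W @ H_s + W_b @ H_b, W @ H_r

    def cost():
        X_mix, X_ref = model()
        return np.sum((V_mix - X_mix) ** 2) + np.sum((V_ref - X_ref) ** 2)

    costs = [cost()]
    for _ in range(n_iter):
        # Shared dictionary: gradients from both factorizations are pooled.
        X_mix, X_ref = model()
        W *= (T.T @ V_mix @ H_s.T + V_ref @ H_r.T) / (
            T.T @ X_mix @ H_s.T + X_ref @ H_r.T + eps)
        X_mix, X_ref = model()
        H_r *= (W.T @ V_ref) / (W.T @ X_ref + eps)
        A = T @ W
        H_s *= (A.T @ V_mix) / (A.T @ X_mix + eps)
        # Deformation matrix, then the background model.
        X_mix, _ = model()
        B = W @ H_s
        T *= (V_mix @ B.T) / (X_mix @ B.T + eps)
        X_mix, _ = model()
        W_b *= (V_mix @ H_b.T) / (X_mix @ H_b.T + eps)
        X_mix, _ = model()
        H_b *= (W_b.T @ V_mix) / (W_b.T @ X_mix + eps)
        costs.append(cost())

    # Soft-mask estimate of the target source in the mixture.
    X_mix, _ = model()
    S_hat = V_mix * (T @ W @ H_s) / (X_mix + eps)
    return S_hat, costs

# Synthetic usage example (random data, for illustration only).
rng = np.random.default_rng(1)
F, N, M = 16, 30, 25
W_true = rng.random((F, 3))
V_ref = W_true @ rng.random((3, M))
V_mix = W_true @ rng.random((3, N)) + rng.random((F, 2)) @ rng.random((2, N))
S_hat, costs = cofactorize(V_mix, V_ref)
print(costs[0], costs[-1])
```

Sharing `W` between the two factorizations is what lets the reference guide the separation, while `T` absorbs spectral differences between the reference and the target. The abstract also mentions transformations along the dictionary-component and time axes and a GEM algorithm for the multi-channel case; none of these are covered by this sketch.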
Pages: 1775 / 1787
Page count: 13
Related papers
50 in total
  • [1] AUDIO SOURCE SEPARATION USING MULTIPLE DEFORMED REFERENCES
    Souviraa-Labastie, Nathan
    Olivero, Anaik
    Vincent, Emmanuel
    Bimbot, Frederic
    2014 PROCEEDINGS OF THE 22ND EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2014, : 311 - 315
  • [2] Raw Multi-Channel Audio Source Separation using Multi-Resolution Convolutional Auto-Encoders
    Grais, Emad M.
    Ward, Dominic
    Plumbley, Mark D.
    2018 26TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2018, : 1577 - 1581
  • [3] Multi-Channel Audio Source Separation Using Azimuth-Frequency Analysis and Convolutional Neural Network
    Moon, Jung Min
    Kim, Jun Ho
    Kim, Tae Woo
    Chun, Chan Jun
    Kim, Hong Kook
    2019 1ST INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE IN INFORMATION AND COMMUNICATION (ICAIIC 2019), 2019, : 500 - 503
  • [4] Multi-channel underdetermined blind source separation for recorded audio mixture signals using an unmanned aerial vehicle
    Xie, Kan
    Jiang, Kanyang
    Yang, Qiyu
    IET COMMUNICATIONS, 2021, 15 (10) : 1412 - 1422
  • [5] Multi-channel source separation by factorial HMMs
    Reyes-Gomez, MJ
    Raj, B
    Ellis, DPW
    2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 664 - 667
  • [6] Decomposition and recognition of a multi-channel audio source using matching pursuit algorithm
    Bjornberg, DB
    Agili, S
    Morales, A
    2004 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOL 5, PROCEEDINGS, 2004, : 624 - 627
  • [7] Multi-channel source separation preserving spatial information
    Aichner, Robert
    Buchner, Herbert
    Zourub, Meray
    Kellermann, Walter
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PTS 1-3, PROCEEDINGS, 2007, : 5 - 8
  • [8] Spatial Loss for Unsupervised Multi-channel Source Separation
    Saijo, Kohei
    Scheibler, Robin
    INTERSPEECH 2022, 2022, : 241 - 245
  • [9] AUDIO-VISUAL MULTI-CHANNEL SPEECH SEPARATION, DEREVERBERATION AND RECOGNITION
    Li, Guinan
    Yu, Jianwei
    Deng, Jiajun
    Liu, Xunying
    Meng, Helen
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 6042 - 6046
  • [10] A multi-channel audio compression method with virtual source location information
    Moon, HG
    Seo, JI
    Beak, S
    Sung, KM
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2005, PT 1, 2005, 3767 : 742 - 753