Multi-Channel Audio Source Separation Using Multiple Deformed References

Cited by: 19
Authors
Souviraa-Labastie, Nathan [1]
Olivero, Anaik [1]
Vincent, Emmanuel [2]
Bimbot, Frederic [1]
Affiliations
[1] Univ Rennes 1, CNRS, Inria, PANAMA Project Team, IRISA, F-35000 Rennes, France
[2] Inria, F-54600 Villers Les Nancy, France
Funding
European Research Council
Keywords
Generalized expectation-maximization (GEM) algorithm; source separation; nonnegative matrix factorization; blind; information; models
DOI
10.1109/TASLP.2015.2450494
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
We present a general multi-channel source separation framework in which additional audio references are available for one or more sources of a given mixture. Each audio reference is itself a mixture that is assumed to contain at least one source similar to one of the target sources. Deformations between the sources of interest and their references are modeled linearly using a generic formulation. This is done by adding transformation matrices to an excitation-filter model, which affects different axes, namely frequency, dictionary component, or time. A nonnegative matrix co-factorization algorithm and a generalized expectation-maximization algorithm are used to estimate the model parameters. Different model parameterizations and different combinations of algorithms are tested on music-plus-voice mixtures guided by music and/or voice references, and on professionally produced music recordings guided by cover references. Our algorithms improve the signal-to-distortion ratio (SDR) of the lowest-intensity sources by 9 to 15 decibels (dB) relative to the original mixtures.
Pages: 1775-1787
Number of pages: 13