UNSUPERVISED MULTI-CHANNEL SEPARATION AND ADAPTATION

Authors
Han, Cong [1, 2]
Wilson, Kevin [2 ]
Wisdom, Scott [2 ]
Hershey, John R. [2 ]
Affiliations
[1] Columbia Univ, New York, NY 10027 USA
[2] Google, Mountain View, CA 94043 USA
Keywords
multi-channel; speech separation
DOI
10.1109/ICASSP48485.2024.10447422
Abstract
A key challenge in machine learning is to generalize from training data to an application domain of interest. This work extends the recently-proposed mixture invariant training (MixIT) algorithm to perform unsupervised learning in the multi-channel setting. We use MixIT to train a model on far-field microphone array recordings of overlapping reverberant and noisy speech from the AMI Corpus. The models are trained on both supervised and unsupervised training data, and are tested on real AMI recordings containing overlapping speech. To objectively evaluate our models, we also use a synthetic multi-channel AMI test set. Holding network architectures constant, we find that semi-supervised fine-tuning of a model pretrained on a large and diverse single-channel dataset yields the largest improvement to SI-SNR and to human listening ratings across synthetic and real datasets, outperforming supervised models trained on well-matched synthetic data. Our results demonstrate that unsupervised learning through MixIT enables model adaptation on both single- and multi-channel real-world speech recordings.
Pages: 721-725
Page count: 5
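
The abstract names two technical ingredients, the SI-SNR metric and the mixture invariant training (MixIT) objective. The following is a minimal single-channel sketch of both in NumPy, not the authors' implementation. It assumes two reference mixtures, a brute-force search over all assignments of estimated sources to mixtures, and negative SI-SNR as the per-mixture loss. The function names `si_snr` and `mixit_loss` are hypothetical, and the loss actually used in the paper is not stated in this record.

```python
import itertools

import numpy as np


def si_snr(reference, estimate, eps=1e-8):
    """Scale-invariant signal-to-noise ratio in dB between two 1-D signals."""
    reference = reference - reference.mean()
    estimate = estimate - estimate.mean()
    # Project the estimate onto the reference to obtain the scaled target.
    scale = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = scale * reference
    noise = estimate - target
    ratio = (np.dot(target, target) + eps) / (np.dot(noise, noise) + eps)
    return 10.0 * np.log10(ratio)


def mixit_loss(mixtures, est_sources):
    """Illustrative MixIT objective (assumption: negative SI-SNR per mixture).

    mixtures:    array of shape (2, T), the two reference mixtures.
    est_sources: array of shape (M, T), sources estimated from their sum.
    Returns (best_loss, best_assignment), where best_assignment[m] is the
    index of the mixture that estimated source m is assigned to.
    """
    n_mix, n_src = mixtures.shape[0], est_sources.shape[0]
    best_loss, best_assign = np.inf, None
    # Enumerate every assignment of each source to exactly one mixture.
    for assign in itertools.product(range(n_mix), repeat=n_src):
        # Binary mixing matrix: each column has a single 1.
        mixing = np.zeros((n_mix, n_src))
        mixing[list(assign), np.arange(n_src)] = 1.0
        remixed = mixing @ est_sources
        loss = -sum(si_snr(mixtures[i], remixed[i]) for i in range(n_mix))
        if loss < best_loss:
            best_loss, best_assign = loss, assign
    return best_loss, best_assign
```

The search visits 2^M assignments for M estimated sources, which is only practical for small M. A multi-channel variant would also need to choose which microphone channel serves as the reference for each mixture, which this sketch leaves out.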