UNSUPERVISED MULTI-CHANNEL SEPARATION AND ADAPTATION

被引:0
|
作者
Han, Cong [1 ,2 ]
Wilson, Kevin [2 ]
Wisdom, Scott [2 ]
Hershey, John R. [2 ]
机构
[1] Columbia Univ, New York, NY 10027 USA
[2] Google, Mountain View, CA 94043 USA
关键词
multi-channel; speech separation;
D O I
10.1109/ICASSP48485.2024.10447422
中图分类号
学科分类号
摘要
A key challenge in machine learning is to generalize from training data to an application domain of interest. This work extends the recently-proposed mixture invariant training (MixIT) algorithm to perform unsupervised learning in the multi-channel setting. We use MixIT to train a model on far-field microphone array recordings of overlapping reverberant and noisy speech from the AMI Corpus. The models are trained on both supervised and unsupervised training data, and are tested on real AMI recordings containing overlapping speech. To objectively evaluate our models, we also use a synthetic multi-channel AMI test set. Holding network architectures constant, we find that semi-supervised fine-tuning of a model pretrained on a large and diverse single-channel dataset yields the largest improvement to SI-SNR and to human listening ratings across synthetic and real datasets, outperforming supervised models trained on well-matched synthetic data. Our results demonstrate that unsupervised learning through MixIT enables model adaptation on both single- and multi-channel real-world speech recordings.
引用
收藏
页码:721 / 725
页数:5
相关论文
共 50 条
  • [21] Frequency domain multi-channel speech separation and its applications
    Handa, M
    Nagai, T
    Kurematsu, A
    2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING - VOL IV: SIGNAL PROCESSING FOR COMMUNICATIONS; VOL V: SIGNAL PROCESSING EDUCATION SENSOR ARRAY & MULTICHANNEL SIGNAL PROCESSING AUDIO & ELECTROACOUSTICS; VOL VI: SIGNAL PROCESSING THEORY & METHODS STUDENT FORUM, 2001, : 2761 - 2764
  • [22] Multi-Channel Learning with Preprocessing for Automatic Modulation Order Separation
    Sumen, Gizem
    Celebi, Burak Ahmet
    Kurt, Gunes Karabulut
    Gorcin, Ali
    Basaran, Semiha Tedik
    2022 27TH IEEE SYMPOSIUM ON COMPUTERS AND COMMUNICATIONS (IEEE ISCC 2022), 2022,
  • [23] Integrating Spectral and Spatial Features for Multi-Channel Speaker Separation
    Wang, Zhong-Qiu
    Wang, DeLiang
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 2718 - 2722
  • [24] Multi-channel U-Net for Music Source Separation
    Kadandale, Venkatesh S.
    Montesinos, Juan F.
    Haro, Gloria
    Gomez, Emilia
    2020 IEEE 22ND INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP), 2020,
  • [25] A separation and interaction framework for causal multi-channel speech enhancement
    Liu, Wenzhe
    Li, Andong
    Zheng, Chengshi
    Li, Xiaodong
    DIGITAL SIGNAL PROCESSING, 2022, 126
  • [26] Multi-Channel Conversational Speaker Separation via Neural Diarization
    Taherian, Hassan
    Wang, DeLiang
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 2467 - 2476
  • [27] Improved Natural Gradient Algorithms for Multi-channel Signal Separation
    Hao, Cheng
    Na, Yu
    Jun, Liu
    MATERIAL SCIENCE, CIVIL ENGINEERING AND ARCHITECTURE SCIENCE, MECHANICAL ENGINEERING AND MANUFACTURING TECHNOLOGY II, 2014, 651-653 : 2326 - +
  • [28] Separation of harmonic instruments with overlapping partials in multi-channel mixtures
    Viste, H
    Evangelista, G
    2003 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS PROCEEDINGS, 2003, : 25 - 28
  • [29] Multi-channel capillaries and photonic crystal fibres for separation sciences
    Currivan, Sinead
    Upadhyay, Nirved
    Paull, Brett
    TRAC-TRENDS IN ANALYTICAL CHEMISTRY, 2018, 102 : 322 - 331
  • [30] Multi-channel source separation by beamforming trained with factorial HMMS
    Reyes-Gomez, MJ
    Ra, B
    Ellis, DPW
    2003 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS PROCEEDINGS, 2003, : 13 - 16