UNSUPERVISED MULTI-CHANNEL SEPARATION AND ADAPTATION

Cited by: 0
Authors
Han, Cong [1, 2]
Wilson, Kevin [2]
Wisdom, Scott [2]
Hershey, John R. [2]
Affiliations
[1] Columbia Univ, New York, NY 10027 USA
[2] Google, Mountain View, CA 94043 USA
Source
2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024 | 2024
Keywords
multi-channel; speech separation
DOI
10.1109/ICASSP48485.2024.10447422
Abstract
A key challenge in machine learning is to generalize from training data to an application domain of interest. This work extends the recently proposed mixture invariant training (MixIT) algorithm to perform unsupervised learning in the multi-channel setting. We use MixIT to train a model on far-field microphone array recordings of overlapping reverberant and noisy speech from the AMI Corpus. The models are trained on both supervised and unsupervised training data, and are tested on real AMI recordings containing overlapping speech. To objectively evaluate our models, we also use a synthetic multi-channel AMI test set. Holding network architectures constant, we find that semi-supervised fine-tuning of a model pretrained on a large and diverse single-channel dataset yields the largest improvement to SI-SNR and to human listening ratings across synthetic and real datasets, outperforming supervised models trained on well-matched synthetic data. Our results demonstrate that unsupervised learning through MixIT enables model adaptation on both single- and multi-channel real-world speech recordings.
Pages: 721-725
Number of pages: 5
Related papers
50 in total
  • [11] Multi-Channel Multi-Frame ADL-MVDR for Target Speech Separation
    Zhang, Zhuohuang
    Xu, Yong
    Yu, Meng
    Zhang, Shi-Xiong
    Chen, Lianwu
    Williamson, Donald S.
    Yu, Dong
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 3526 - 3540
  • [12] Time-frequency Domain Filter-and-sum Network for Multi-channel Speech Separation
    Deng, Zhewen
    Zhou, Yi
    Liu, Hongqing
    INTERSPEECH 2023, 2023, : 3689 - 3693
  • [13] Implicit Filter-and-sum Network for End-to-end Multi-channel Speech Separation
    Luo, Yi
    Mesgarani, Nima
    INTERSPEECH 2021, 2021, : 3071 - 3075
  • [14] AUDIO-VISUAL MULTI-CHANNEL SPEECH SEPARATION, DEREVERBERATION AND RECOGNITION
    Li, Guinan
    Yu, Jianwei
    Deng, Jiajun
    Liu, Xunying
    Meng, Helen
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 6042 - 6046
  • [15] Ad hoc Networks Multi-Channel MAC Protocol Design and Channel Width Adaptation Technology
    Wang, Fucai
    Zhao, Haitao
    Song, An
    Shi, Chunguang
    2011 7TH INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, NETWORKING AND MOBILE COMPUTING (WICOM), 2011,
  • [16] Joint Channel Width Adaptation, Topology Control, and Routing for Multi-Radio Multi-Channel Wireless Mesh Networks
    Li, Li
    Zhang, Chunyuan
    2009 6TH IEEE CONSUMER COMMUNICATIONS AND NETWORKING CONFERENCE, VOLS 1 AND 2, 2009, : 459 - 463
  • [17] Overlapped Sound Event Classification via Multi-Channel Sound Separation Network
    Giannoulis, Panagiotis
    Potamianos, Gerasimos
    Maragos, Petros
    29TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2021), 2021, : 571 - 575
  • [18] 3D SPATIAL FEATURES FOR MULTI-CHANNEL TARGET SPEECH SEPARATION
    Gu, Rongzhi
    Zhang, Shi-Xiong
    Yu, Meng
    Yu, Dong
    2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 996 - 1002
  • [19] MULTI-CHANNEL NARROW-BAND DEEP SPEECH SEPARATION WITH FULL-BAND PERMUTATION INVARIANT TRAINING
    Quan, Changsheng
    Li, Xiaofei
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 541 - 545
  • [20] Multi-channel deflection crossbar (MCDC): A VLSI optimized architecture for multi-channel ATM switching
    Yan, PY
    Kim, KS
    Min, PS
    Hegde, MV
    IEEE INFOCOM '97 - THE CONFERENCE ON COMPUTER COMMUNICATIONS, PROCEEDINGS, VOLS 1-3: SIXTEENTH ANNUAL JOINT CONFERENCE OF THE IEEE COMPUTER AND COMMUNICATIONS SOCIETIES - DRIVING THE INFORMATION REVOLUTION, 1997, : 12 - 19