Blind Separation and Dereverberation of Speech Mixtures by Joint Optimization

Cited by: 110
Authors
Yoshioka, Takuya [1]
Nakatani, Tomohiro [1]
Miyoshi, Masato [2]
Okuno, Hiroshi G. [3]
Affiliations
[1] NTT Corp, NTT Commun Sci Labs, Kyoto 6190237, Japan
[2] Kanazawa Univ, Grad Sch Nat Sci & Technol, Kanazawa, Ishikawa 9201192, Japan
[3] Kyoto Univ, Grad Sch Informat, Dept Intelligence Sci & Technol, Kyoto 6068501, Japan
Keywords
Blind source separation (BSS); blind dereverberation (BD); conditional separation and dereverberation (CSD); convolutive mixtures; identification; deconvolution; suppression; algorithms; signals
DOI
10.1109/TASL.2010.2045183
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper proposes a method for performing blind source separation (BSS) and blind dereverberation (BD) at the same time for speech mixtures. In most previous studies, BSS and BD have been investigated separately. The separation performance of conventional BSS methods deteriorates as the reverberation time increases while many existing BD methods rely on the assumption that there is only one sound source in a room. Therefore, it has been difficult to perform both BSS and BD when the reverberation time is long. The proposed method uses a network, in which dereverberation and separation networks are connected in tandem, to estimate source signals. The parameters for the dereverberation network (prediction matrices) and those for the separation network (separation matrices) are jointly optimized. This enables a BD process to take a BSS process into account. The prediction and separation matrices are alternately optimized with each depending on the other; hence, we call the proposed method the conditional separation and dereverberation (CSD) method. Comprehensive evaluation results are reported, where all the speech materials contained in the complete test set of the TIMIT corpus are used. The CSD method improves the signal-to-interference ratio by an average of about 4 dB over the conventional frequency-domain BSS approach for reverberation times of 0.3 and 0.5 s. The direct-to-reverberation ratio is also improved by about 10 dB.
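The tandem structure described in the abstract (a dereverberation network followed by a separation network) can be illustrated with a minimal forward-pass sketch. This is only an illustration under stated assumptions, not the authors' implementation: it assumes processing in one STFT frequency bin, a delayed linear-prediction form for the dereverberation stage, and a single linear separation matrix. The function name `tandem_forward`, the array layouts, and the `delay` parameter are hypothetical, and the alternating optimization of the prediction and separation matrices is not shown.

```python
import numpy as np

def tandem_forward(x, G, W, delay=1):
    """Apply a dereverberation stage, then a separation stage (one frequency bin).

    Assumed (hypothetical) layouts, not taken from the paper:
      x : (T, M) complex array, T frames from M microphones
      G : (K, M, M) complex array, one prediction matrix per lag
      W : (M, M) complex array, separation matrix
    Returns (y, s): the dereverberated frames and the separated frames.
    """
    T, M = x.shape
    y = x.astype(complex)  # astype returns a new array, so x is left untouched
    for k in range(G.shape[0]):
        lag = delay + k
        if lag >= T:
            break
        # Subtract the part of frame t that is predicted from frame t - lag,
        # i.e. y_t -= G_k^H x_{t-lag}, written in row-vector form.
        y[lag:] -= x[:-lag] @ G[k].conj()
    # Separation stage: s_t = W^H y_t, again in row-vector form.
    s = y @ W.conj()
    return y, s
```

With all prediction matrices set to zero and `W` set to the identity, the sketch returns the input unchanged, which is a quick sanity check on the wiring of the two stages.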
Pages: 69-84
Page count: 16
Related papers
50 in total
  • [21] Blind separation of speech mixtures based on nonstationarity
    Pham, DT
    Servière, C
    Boumaraf, H
    SEVENTH INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND ITS APPLICATIONS, VOL 2, PROCEEDINGS, 2003, : 73 - 76
  • [22] Joint Online Multichannel Acoustic Echo Cancellation, Speech Dereverberation and Source Separation
    Na, Yueyue
    Wang, Ziteng
    Liu, Zhang
    Tian, Biao
    Fu, Qiang
    INTERSPEECH 2021, 2021, : 1144 - 1148
  • [23] Joint source separation and dereverberation using constrained spectral divergence optimization
    Nathwani, Karan
    Hegde, Rajesh M.
    SIGNAL PROCESSING, 2015, 106 : 266 - 281
  • [24] Blind speech separation using a joint model of speech production
    Smith, D
    Lukasiak, J
    Burnett, I
    IEEE SIGNAL PROCESSING LETTERS, 2005, 12 (11) : 784 - 787
  • [25] Blind source separation algorithm for convolutive speech mixtures using joint block-diagonalization
    Xu, Shun
    Chen, Shao-Rong
    Liu, Yu-Lin
    Zhendong yu Chongji/Journal of Vibration and Shock, 2007, 26 (08) : 86 - 90
  • [26] A multistage approach to blind separation of convolutive speech mixtures
    Jan, Tariqullah
    Wang, Wenwu
    Wang, DeLiang
    SPEECH COMMUNICATION, 2011, 53 (04) : 524 - 539
  • [27] A Comprehensive Approach to Blind Source Separation of Speech Mixtures
    Zhao, Mengyi
    He, Zhiming
    2013 3RD INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT), 2013, : 991 - 994
  • [28] A comprehensive approach to blind source separation of speech mixtures
    2013, Institute of Electrical and Electronics Engineers Inc., United States
  • [29] Blind multichannel identification for speech dereverberation and enhancement
    Yu, ZL
    Er, MH
    2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PROCEEDINGS: AUDIO AND ELECTROACOUSTICS SIGNAL PROCESSING FOR COMMUNICATIONS, 2004, : 105 - 108
  • [30] A MULTISTAGE APPROACH FOR BLIND SEPARATION OF CONVOLUTIVE SPEECH MIXTURES
    Jan, Tariqullah
    Wang, Wenwu
    Wang, DeLiang
    2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 1713 - +