Monaural Speech Separation Based on Gain Adapted Minimum Mean Square Error Estimation

Cited by: 0
Authors
M. H. Radfar
R. M. Dansereau
W.-Y. Chan
Affiliations
[1] Carleton University, Department of Systems and Computer Engineering
[2] Queen’s University, Department of Electrical and Computer Engineering
Source
Journal of Signal Processing Systems | 2010 / Volume 61
Keywords
Source separation; Model-based monaural speech separation; Minimum mean square error estimation; Gain adaptation; Mixmax approximation;
DOI
Not available
Abstract
We present a new model-based monaural speech separation technique for separating two speech signals from a single recording of their mixture. This work is an attempt to solve a fundamental limitation in current model-based monaural speech separation techniques in which it is assumed that the data used in the training and test phases of the separation model have the same energy level. To overcome this limitation, a gain adapted minimum mean square error estimator is derived which estimates sources under different signal-to-signal ratios. Specifically, the speakers’ gains are incorporated as unknown parameters into the separation model and then the estimator is derived in terms of the source distributions and the signal-to-signal ratio. Experimental results show that the proposed system improves the separation performance significantly when compared with a similar model without gain adaptation as well as a maximum likelihood estimator with gain estimation.
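The idea of gain-adapted, model-based separation can be illustrated with a small, self-contained sketch. The toy codebooks, the uniform codeword priors, the Gaussian likelihood placed around the MixMax prediction, and the grid search over log-domain gains are all illustrative assumptions made here for brevity; they are not the estimator derived in the paper, which works directly with the speakers' source distributions and the signal-to-signal ratio.

```python
import numpy as np

rng = np.random.default_rng(0)

F = 16       # frequency bands per log-spectral frame
K = 8        # codewords per speaker (stand-in for trained source models)
SIGMA = 1.0  # std. dev. of the assumed Gaussian likelihood around the MixMax prediction

# Toy "trained" codebooks of log-spectral shapes for speakers 1 and 2 (illustrative).
codebook1 = rng.normal(0.0, 3.0, size=(K, F))
codebook2 = rng.normal(0.0, 3.0, size=(K, F))


def mixmax(x1, x2):
    """MixMax approximation: mixture log-spectrum ~ element-wise max of the sources."""
    return np.maximum(x1, x2)


def separate_frame(y, gain_grid=np.arange(-12.0, 12.1, 3.0)):
    """Gain-adapted MMSE-style estimate of both sources from one mixture frame y.

    The speakers' log-domain gains are treated as unknown parameters and searched
    on a grid; for each candidate pair, a posterior over codeword pairs is computed
    under the Gaussian/MixMax likelihood, and the gains with the highest evidence win.
    """
    best = None
    for g1 in gain_grid:
        for g2 in gain_grid:
            # Predicted mixture log-spectrum for every codeword pair: shape (K, K, F).
            pred = mixmax(codebook1[:, None, :] + g1, codebook2[None, :, :] + g2)
            # Log-likelihood of the observed mixture under each codeword pair.
            loglik = -0.5 * np.sum((y - pred) ** 2, axis=-1) / SIGMA ** 2
            # Evidence (log-sum-exp) and posterior over pairs, uniform codeword priors.
            m = loglik.max()
            evidence = m + np.log(np.exp(loglik - m).sum())
            post = np.exp(loglik - evidence)
            if best is None or evidence > best[0]:
                best = (evidence, g1, g2, post)

    _, g1, g2, post = best
    # MMSE-style point estimates: posterior-weighted codewords plus the estimated gains.
    x1_hat = post.sum(axis=1) @ codebook1 + g1
    x2_hat = post.sum(axis=0) @ codebook2 + g2
    return x1_hat, x2_hat, g1, g2


# Simulate one mixture frame in which speaker 1 is 6 log-units louder than speaker 2.
x1_true = codebook1[3] + 6.0
x2_true = codebook2[5] + 0.0
y = mixmax(x1_true, x2_true)

x1_hat, x2_hat, g1_hat, g2_hat = separate_frame(y)
print("estimated log-domain gains:", g1_hat, g2_hat)
print("source 1 estimation error :", np.mean((x1_hat - x1_true) ** 2))
```

Unlike a model that assumes equal training and test energy levels, the gain search above lets the same codebooks explain mixtures recorded at different signal-to-signal ratios; the paper's estimator accomplishes this analytically rather than by grid search.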
Pages: 21-37
Page count: 16