Binaural Speech Separation Based on the Time-Frequency Binary Mask

Cited by: 0
Authors
Mahmoodzadeh, A. [1]
Abutalebi, H. R. [2]
Soltanian-Zadeh, H. [3, 4]
Sheikhzadeh, H. [5]
Affiliations
[1] Islamic Azad Univ, EE Dept, Fars Sci & Res Branch, Shiraz, Iran
[2] Yazd Univ, ECE Dept, Speech Proc Res, Shiraz, Iran
[3] Univ Tehran, Control & Intelligent Proc Ctr Excellence, Tehran, Iran
[4] Henry Ford Hlth, Image Anal Lab, Detroit, MI USA
[5] Amirkabir Univ Technol, Tehran, Iran
Source
2012 SIXTH INTERNATIONAL SYMPOSIUM ON TELECOMMUNICATIONS (IST) | 2012
Keywords
interaural intensity differences; interaural time differences; speech separation; time-frequency binary mask; BLIND SEPARATION; RECOGNITION; SIGNALS
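DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract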
Replicating the perceptual ability of the human auditory system to capture a target voice and filter out interferers remains a great challenge. This paper proposes a binaural system for speech segregation based on two spatial localization cues: Interaural Time Differences (ITD) and Interaural Intensity Differences (IID). A target speech signal is separated from interfering sounds by estimating time-frequency masks with the multi-level extension of the Otsu thresholding algorithm, a method originally used in image segmentation. The ITD and IID are the important features for mask estimation at low and high frequencies, respectively. A systematic evaluation in terms of the Perceptual Evaluation of Speech Quality (PESQ) index shows that the resulting system yields a significant improvement in speech separation performance.
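The abstract does not give the details of the mask estimation, so the following is only a minimal sketch of the general idea: multi-level Otsu thresholding applied to a per-unit spatial cue, followed by selection of the class that contains the target's cue value. The function names (`multi_otsu_thresholds`, `binary_mask_from_cue`), the number of classes, the bin count, and the synthetic ITD map are all assumptions made for illustration and are not taken from the paper.

```python
import itertools
import numpy as np


def multi_otsu_thresholds(values, n_classes=3, n_bins=64):
    """Exhaustive multi-level Otsu on a 1-D set of cue values (hypothetical helper).

    Returns n_classes - 1 thresholds that maximize the between-class variance
    of the histogram of `values`.
    """
    hist, edges = np.histogram(values, bins=n_bins)
    p = hist / hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Cumulative zeroth and first moments; index k covers bins [0, k).
    P = np.concatenate(([0.0], np.cumsum(p)))
    S = np.concatenate(([0.0], np.cumsum(p * centers)))

    best_score, best_cuts = -np.inf, None
    for cuts in itertools.combinations(range(1, n_bins), n_classes - 1):
        bounds = (0,) + cuts + (n_bins,)
        score = 0.0
        for lo, hi in zip(bounds[:-1], bounds[1:]):
            w = P[hi] - P[lo]
            if w > 0:
                # w * mu^2; maximizing the sum over classes is equivalent
                # to maximizing the between-class variance.
                score += (S[hi] - S[lo]) ** 2 / w
        if score > best_score:
            best_score, best_cuts = score, cuts
    return edges[list(best_cuts)]


def binary_mask_from_cue(cue, target_cue, n_classes=3):
    """Binary T-F mask: 1 where a unit falls in the same Otsu class as the target cue."""
    thresholds = multi_otsu_thresholds(cue.ravel(), n_classes)
    labels = np.digitize(cue, thresholds)
    target_label = np.digitize(target_cue, thresholds)
    return labels == target_label


# Synthetic example (assumed data): three sources with ITDs of -0.5, 0 and 0.5 ms,
# each time-frequency unit dominated by exactly one source.
rng = np.random.default_rng(0)
source_itd_ms = np.array([-0.5, 0.0, 0.5])
dominant = rng.integers(0, 3, size=(32, 100))  # channels x frames
itd_map = source_itd_ms[dominant] + 0.05 * rng.standard_normal(dominant.shape)

mask = binary_mask_from_cue(itd_map, target_cue=0.0)
accuracy = np.mean(mask == (dominant == 1))
print(f"fraction of units labelled correctly: {accuracy:.3f}")
```

Under the same assumptions, the function could be applied to an IID map in the high-frequency channels and to an ITD map in the low-frequency channels, in line with the frequency split described in the abstract. The exhaustive search is practical only for a small number of classes and bins.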
Pages: 848-853
Number of pages: 6