Binaural Speech Separation Based on the Time-Frequency Binary Mask

Cited by: 0

Authors:

Mahmoodzadeh, A. [1]

Abutalebi, H. R. [2]

Soltanian-Zadeh, H. [3,4]

Sheikhzadeh, H. [5]

Affiliations:

[1] Islamic Azad Univ, EE Dept, Fars Sci & Res Branch, Shiraz, Iran

[2] Yazd Univ, ECE Dept, Speech Proc Res, Shiraz, Iran

[3] Univ Tehran, Control & Intelligent Proc Ctr Excellence, Tehran, Iran

[4] Henry Ford Hlth, Image Anal Lab, Detroit, MI USA

[5] Amirkabir Univ Technol, Tehran, Iran

Source:

2012 SIXTH INTERNATIONAL SYMPOSIUM ON TELECOMMUNICATIONS (IST) | 2012

Keywords:

interaural intensity differences; interaural time differences; speech separation; time-frequency binary mask; BLIND SEPARATION; RECOGNITION; SIGNALS;

DOI:

Not available

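Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification code: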

0808 ; 0809 ;

Abstract:

Matching the perceptual ability of the human auditory system to capture a target voice and filter out interferers remains a great challenge. This paper proposes a binaural system for speech segregation based on two spatial localization cues: interaural time differences (ITD) and interaural intensity differences (IID). A target speech signal is separated from interfering sounds by estimating time-frequency masks with the multi-level extension of the Otsu thresholding algorithm, which is used in image segmentation. The ITD and IID are the important features for mask estimation at low and high frequencies, respectively. A systematic evaluation in terms of the Perceptual Evaluation of Speech Quality (PESQ) index shows that the resulting system yields a significant improvement in speech separation performance.
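The abstract does not give implementation details, so the following is only a rough sketch of how a time-frequency binary mask might be derived from IID values with multi-level Otsu thresholding. The function names (`multi_otsu_thresholds`, `iid_db`, `binary_mask_from_iid`), the 64-bin histogram, the choice of three classes, and the use of IID alone are all assumptions made for illustration; the paper's actual feature extraction, its use of ITD at low frequencies, and its parameter choices may differ.

```python
import itertools
import numpy as np


def multi_otsu_thresholds(values, n_classes=3, n_bins=64):
    """Brute-force multi-level Otsu on a 1-D set of values.

    Returns the n_classes - 1 thresholds that maximize the
    between-class variance of the histogram (illustrative only).
    """
    hist, edges = np.histogram(values, bins=n_bins)
    prob = hist / hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Cumulative zeroth and first moments, with a leading zero for slicing.
    cum_w = np.concatenate(([0.0], np.cumsum(prob)))
    cum_m = np.concatenate(([0.0], np.cumsum(prob * centers)))

    best_score, best_cuts = -1.0, None
    for cuts in itertools.combinations(range(1, n_bins), n_classes - 1):
        bounds = (0, *cuts, n_bins)
        score = 0.0
        for lo, hi in zip(bounds[:-1], bounds[1:]):
            w = cum_w[hi] - cum_w[lo]
            if w > 0:
                m = cum_m[hi] - cum_m[lo]
                # Sum of w_k * mu_k^2; maximizing it is equivalent to
                # maximizing between-class variance (global mean is fixed).
                score += m * m / w
        if score > best_score:
            best_score, best_cuts = score, cuts
    return edges[list(best_cuts)]


def iid_db(left_stft, right_stft, eps=1e-12):
    """IID in dB for each time-frequency unit of two ear spectrograms."""
    return 20.0 * np.log10((np.abs(left_stft) + eps) / (np.abs(right_stft) + eps))


def binary_mask_from_iid(iid, target_class, n_classes=3):
    """Mark the time-frequency units whose IID falls in the target class."""
    thresholds = multi_otsu_thresholds(iid.ravel(), n_classes=n_classes)
    labels = np.digitize(iid, thresholds)  # class index 0..n_classes-1
    return labels == target_class
```

In this sketch, each Otsu class is assumed to correspond to one spatial source, and `target_class` selects the class attributed to the target talker. Applying the resulting Boolean mask to a mixture spectrogram (element-wise multiplication) would retain only the units assigned to that class.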


Pages: 848-853

Page count: 6

