Binaural Speech Separation Based on the Time-Frequency Binary Mask

Citations: 0
Authors
Mahmoodzadeh, A. [1 ]
Abutalebi, H. R. [2 ]
Soltanian-Zadeh, H. [3 ,4 ]
Sheikhzadeh, H. [5 ]
Affiliations
[1] Islamic Azad Univ, EE Dept, Fars Sci & Res Branch, Shiraz, Iran
[2] Yazd Univ, ECE Dept, Speech Proc Res, Shiraz, Iran
[3] Univ Tehran, Control & Intelligent Proc Ctr Excellence, Tehran, Iran
[4] Henry Ford Hlth, Image Anal Lab, Detroit, MI USA
[5] Amirkabir Univ Technol, Tehran, Iran
Source
2012 SIXTH INTERNATIONAL SYMPOSIUM ON TELECOMMUNICATIONS (IST) | 2012
Keywords
interaural intensity differences; interaural time differences; speech separation; time-frequency binary mask; BLIND SEPARATION; RECOGNITION; SIGNALS;
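DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code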
0808 ; 0809 ;
Abstract
Replicating the human auditory system's ability to capture a target voice while filtering out interferers remains a great challenge. This paper proposes a binaural system for speech segregation based on two spatial localization cues: interaural time differences (ITD) and interaural intensity differences (IID). A target speech signal is separated from interfering sounds by estimating time-frequency masks with the multi-level extension of the Otsu thresholding algorithm used in image segmentation. ITD and IID are the important features for mask estimation at low and high frequencies, respectively. A systematic evaluation in terms of the Perceptual Evaluation of Speech Quality (PESQ) index shows that the resulting system yields a significant improvement in speech separation performance.
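The abstract does not spell out the mask-estimation procedure, so the following is only a minimal sketch of the general idea. It applies a generic three-class (two-threshold) Otsu search to a synthetic, ITD-like cue and marks the middle class as the target. The function name `multi_otsu_two_thresholds`, the bin count, the synthetic cue values, and the choice of the middle class as the target are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def multi_otsu_two_thresholds(values, n_bins=64):
    """Exhaustive three-class Otsu search (hypothetical helper).

    Returns the two thresholds that maximize the between-class
    variance of the histogram of `values`.
    """
    hist, edges = np.histogram(values, bins=n_bins)
    p = hist / hist.sum()                      # bin probabilities
    centers = 0.5 * (edges[:-1] + edges[1:])   # bin centers
    cum_p = np.cumsum(p)                       # cumulative probability
    cum_m = np.cumsum(p * centers)             # cumulative first moment
    total_mean = cum_m[-1]

    def class_term(lo, hi):
        # Contribution w * (mean - total_mean)^2 of bins lo..hi inclusive.
        w = cum_p[hi] - (cum_p[lo - 1] if lo > 0 else 0.0)
        if w <= 0:
            return 0.0
        m = (cum_m[hi] - (cum_m[lo - 1] if lo > 0 else 0.0)) / w
        return w * (m - total_mean) ** 2

    best_score, best_pair = -1.0, (0, 1)
    for i in range(n_bins - 2):                # last bin of class 1
        for j in range(i + 1, n_bins - 1):     # last bin of class 2
            score = (class_term(0, i)
                     + class_term(i + 1, j)
                     + class_term(j + 1, n_bins - 1))
            if score > best_score:
                best_score, best_pair = score, (i, j)
    i, j = best_pair
    return edges[i + 1], edges[j + 1]


# Synthetic cue: one value per time-frequency unit, drawn from three
# clusters that stand in for three source directions (assumed data).
rng = np.random.default_rng(0)
itd = np.concatenate([
    rng.normal(-0.5, 0.05, 400),   # interferer on one side
    rng.normal(0.0, 0.05, 400),    # assumed target, straight ahead
    rng.normal(0.5, 0.05, 400),    # interferer on the other side
])

t1, t2 = multi_otsu_two_thresholds(itd)

# Binary mask: 1 for units whose cue falls in the middle class.
mask = ((itd > t1) & (itd <= t2)).astype(int)
print(t1, t2, mask.sum())
```

In a full system the cue would be computed per time-frequency unit from the left and right ear signals, with ITD used at low frequencies and IID at high frequencies as the abstract states. The exhaustive search shown here is practical only for a small number of thresholds and bins.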
Pages: 848-853
Number of pages: 6