Binaural Speech Separation Based on the Time-Frequency Binary Mask

Citations: 0
Authors
Mahmoodzadeh, A. [1]
Abutalebi, H. R. [2]
Soltanian-Zadeh, H. [3, 4]
Sheikhzadeh, H. [5]
Affiliations
[1] Islamic Azad Univ, EE Dept, Fars Sci & Res Branch, Shiraz, Iran
[2] Yazd Univ, ECE Dept, Speech Proc Res, Shiraz, Iran
[3] Univ Tehran, Control & Intelligent Proc Ctr Excellence, Tehran, Iran
[4] Henry Ford Hlth, Image Anal Lab, Detroit, MI USA
[5] Amirkabir Univ Technol, Tehran, Iran
Source
2012 SIXTH INTERNATIONAL SYMPOSIUM ON TELECOMMUNICATIONS (IST) | 2012
Keywords
interaural intensity differences; interaural time differences; speech separation; time-frequency binary mask; BLIND SEPARATION; RECOGNITION; SIGNALS;
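DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808; 0809
Abstract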
Replicating the perceptual ability of the human auditory system to capture a target voice and filter out interferers remains a great challenge. This paper proposes a binaural system for speech segregation based on two spatial localization cues: Interaural Time Differences (ITD) and Interaural Intensity Differences (IID). A target speech signal is separated from interfering sounds by estimating time-frequency masks with the multi-level extension of the Otsu thresholding algorithm, a method originally used in image segmentation. ITD and IID are the important features for mask estimation at low and high frequencies, respectively. A systematic evaluation in terms of the Perceptual Evaluation of Speech Quality (PESQ) index shows that the resulting system yields a significant improvement in speech separation performance.
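
The mask-estimation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `multi_otsu_thresholds` and `binary_mask`, the histogram size, the exhaustive search, and the rule that the target occupies the class with the largest cue values are all assumptions made for illustration. The sketch takes one cue value (for example an IID in dB) per time-frequency unit, finds multi-level Otsu thresholds by maximizing the between-class variance, and keeps the units above the highest threshold.

```python
from itertools import combinations


def multi_otsu_thresholds(values, classes=3, bins=32):
    """Return classes-1 thresholds that maximize between-class variance.

    Exhaustive search over histogram bin edges (hypothetical helper, fine for
    small `bins` and `classes`; not the paper's code).
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("values must not all be equal")
    width = (hi - lo) / bins

    # Histogram of the cue values.
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]

    # Prefix sums of counts and of count-weighted bin centers.
    cnt, mom = [0], [0.0]
    for h, c in zip(hist, centers):
        cnt.append(cnt[-1] + h)
        mom.append(mom[-1] + h * c)

    best_score, best_cuts = -1.0, None
    for cuts in combinations(range(1, bins), classes - 1):
        edges = (0,) + cuts + (bins,)
        score = 0.0
        for a, b in zip(edges, edges[1:]):
            n = cnt[b] - cnt[a]
            if n == 0:  # skip partitions that leave a class empty
                score = -1.0
                break
            s = mom[b] - mom[a]
            # Summing n_k * mean_k^2 ranks partitions exactly like the
            # between-class variance, because the global mean is constant.
            score += s * s / n
        if score > best_score:
            best_score, best_cuts = score, cuts
    if best_cuts is None:
        raise ValueError("not enough distinct values for the requested classes")
    return [lo + c * width for c in best_cuts]


def binary_mask(cue, threshold):
    """1 where the cue exceeds the threshold (assumed target region), else 0."""
    return [1 if v > threshold else 0 for v in cue]


# Made-up IID values (dB) for nine time-frequency units, three sources.
iid = [-10.2, -9.8, -10.0, -0.1, 0.2, 0.0, 9.9, 10.1, 10.0]
thresholds = multi_otsu_thresholds(iid, classes=3)
mask = binary_mask(iid, thresholds[-1])
```

In a full system the same thresholding would presumably be applied to ITD in low-frequency channels and to IID in high-frequency channels, as the abstract indicates; the sketch covers a single cue only.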
Pages: 848-853
Number of pages: 6