Binaural Speech Separation Based on the Time-Frequency Binary Mask

Citations: 0
Authors
Mahmoodzadeh, A. [1]
Abutalebi, H. R. [2]
Soltanian-Zadeh, H. [3, 4]
Sheikhzadeh, H. [5]
Affiliations
[1] Islamic Azad Univ, EE Dept, Fars Sci & Res Branch, Shiraz, Iran
[2] Yazd Univ, ECE Dept, Speech Proc Res, Shiraz, Iran
[3] Univ Tehran, Control & Intelligent Proc Ctr Excellence, Tehran, Iran
[4] Henry Ford Hlth, Image Anal Lab, Detroit, MI USA
[5] Amirkabir Univ Technol, Tehran, Iran
Source
2012 SIXTH INTERNATIONAL SYMPOSIUM ON TELECOMMUNICATIONS (IST) | 2012
Keywords
interaural intensity differences; interaural time differences; speech separation; time-frequency binary mask; BLIND SEPARATION; RECOGNITION; SIGNALS;
DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Subject classification codes
0808; 0809
Abstract
Matching the perceptual ability of the human auditory system to capture a target voice and filter out interferers remains a great challenge. This paper proposes a binaural system for speech segregation based on two spatial localization cues: Interaural Time Differences (ITD) and Interaural Intensity Differences (IID). A target speech signal is separated from interfering sounds by estimating time-frequency masks with the multi-level extension of the Otsu thresholding algorithm, which is used in image segmentation. The ITD and IID are the important features for mask estimation at low and high frequencies, respectively. A systematic evaluation in terms of the Perceptual Evaluation of Speech Quality (PESQ) index shows that the resulting system yields a significant improvement in speech separation performance.
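The mask-estimation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no details of the front end, so the per-unit cue values below are made-up numbers standing in for ITD or IID estimates, and the names multi_otsu_thresholds and binary_mask are hypothetical. The sketch runs an exhaustive multi-level Otsu search over a histogram of cue values, then marks as 1 the time-frequency units whose cue falls in the class assumed to belong to the target.

```python
from bisect import bisect_right
from itertools import combinations


def multi_otsu_thresholds(values, n_classes=3, n_bins=32):
    """Return n_classes - 1 thresholds maximising between-class variance.

    Exhaustive search over histogram bin edges; adequate for a sketch,
    too slow for many classes or bins.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all cue values are identical")
    width = (hi - lo) / n_bins

    # Histogram of the cue values.
    hist = [0] * n_bins
    for v in values:
        hist[min(int((v - lo) / width), n_bins - 1)] += 1

    # Prefix sums of counts and of count * bin centre.
    cum_w, cum_m = [0], [0.0]
    for i, h in enumerate(hist):
        cum_w.append(cum_w[-1] + h)
        cum_m.append(cum_m[-1] + h * (lo + (i + 0.5) * width))

    best_score, best_cuts = None, None
    for cuts in combinations(range(1, n_bins), n_classes - 1):
        edges = (0,) + cuts + (n_bins,)
        score = 0.0
        for a, b in zip(edges, edges[1:]):
            w = cum_w[b] - cum_w[a]
            if w == 0:  # skip partitions that leave a class empty
                break
            m = cum_m[b] - cum_m[a]
            # Sum of (class sum)^2 / (class count): the total mean is fixed,
            # so maximising this maximises the between-class variance.
            score += m * m / w
        else:
            if best_score is None or score > best_score:
                best_score, best_cuts = score, cuts
    if best_cuts is None:
        raise ValueError("too few distinct values for n_classes")
    return [lo + c * width for c in best_cuts]


def binary_mask(cues, thresholds, target_class):
    """1 where a unit's cue falls in target_class, else 0 (rows = frames)."""
    return [[1 if bisect_right(thresholds, c) == target_class else 0
             for c in row] for row in cues]


# Hypothetical ITD-like cues (arbitrary units) for a 2 x 4 grid of T-F units,
# clustered around three source directions; the target is assumed central.
cues = [[-0.50, -0.48, 0.00, 0.02],
        [0.50, 0.52, -0.52, 0.01]]
flat = [c for row in cues for c in row]
thresholds = multi_otsu_thresholds(flat, n_classes=3)
mask = binary_mask(cues, thresholds, target_class=1)
```

In a full system the mask would be applied to the time-frequency representation of the mixture before resynthesis. Which cue drives the mask in each frequency band (ITD at low frequencies, IID at high frequencies, per the abstract) is left out of this sketch.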
Pages: 848-853
Page count: 6
Related papers
50 in total
  • [21] Time-frequency Domain Filter-and-sum Network for Multi-channel Speech Separation
    Deng, Zhewen
    Zhou, Yi
    Liu, Hongqing
    [J]. INTERSPEECH 2023, 2023, : 3689 - 3693
  • [22] Informed Single Channel Speech Separation With Time-Frequency Exemplar GMM-HMM Model
    Wang, Qi
    Woo, W. L.
    Dlay, S. S.
    Chin, C. S.
    Gao, Bin
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING (DSP), 2015, : 1130 - 1134
  • [23] Extraction of Expression from Japanese Speech based on Time-Frequency and Fractal Features
    Phothisonothai, Montri
    Arita, Yasunori
    Watanabe, Katsumi
    [J]. 2013 10TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING/ELECTRONICS, COMPUTER, TELECOMMUNICATIONS AND INFORMATION TECHNOLOGY (ECTI-CON), 2013,
  • [24] Adaptive time-frequency data fusion for speech enhancement
    Shi, G
    Aarabi, P
    Lazic, N
    [J]. FUSION 2003: PROCEEDINGS OF THE SIXTH INTERNATIONAL CONFERENCE OF INFORMATION FUSION, VOLS 1 AND 2, 2003, : 394 - 399
  • [25] Time-Frequency Mask-Aware Bidirectional LSTM: A Deep Learning Approach for Underwater Acoustic Signal Separation
    Chen, Jie
    Liu, Chang
    Xie, Jiawu
    An, Jie
    Huang, Nan
    [J]. SENSORS, 2022, 22 (15)
  • [26] Blind Separation of Synchronous-networking Frequency Hopping Signals Based on Time-frequency Analysis
    Lei, Ziwei
    Zheng, Linhua
    Ding, Hong
    Liu, Haibin
    Liu, Yongyong
    [J]. 9TH INTERNATIONAL CONFERENCE ON FUTURE NETWORKS AND COMMUNICATIONS (FNC'14) / THE 11TH INTERNATIONAL CONFERENCE ON MOBILE SYSTEMS AND PERVASIVE COMPUTING (MOBISPC'14) / AFFILIATED WORKSHOPS, 2014, 34 : 31 - 38
  • [27] End-to-End Speech Separation Using Orthogonal Representation in Complex and Real Time-Frequency Domain
    Wang, Kai
    Huang, Hao
    Hu, Ying
    Huang, Zhihua
    Li, Sheng
    [J]. INTERSPEECH 2021, 2021, : 3046 - 3050
  • [28] Blind Separation of Radar Signals Based on Time-Frequency Analysis of Short Time Fourier Transformation
    Cheng Xu-De
    Xue Xue-Dong
    Xu Bing
    Zheng Yuan
    Wang Pin
    [J]. PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON ENGINEERING SCIENCE AND MANAGEMENT (ESM), 2016, 62 : 50 - 53
  • [29] Measuring time-frequency importance functions of speech with bubble noise
    Mandel, Michael I.
    Yoho, Sarah E.
    Healy, Eric W.
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2016, 140 (04) : 2542 - 2553
  • [30] Time-Frequency Masking in the Complex Domain for Speech Dereverberation and Denoising
    Williamson, Donald S.
    Wang, DeLiang
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (07) : 1492 - 1501