Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation

Cited by: 308
Authors
Brungart, Douglas S.
Chang, Peter S.
Simpson, Brian D.
Wang, DeLiang
Affiliations
[1] USAF, Res Lab, Human Effectiveness Directorate, Wright Patterson AFB, OH 45433 USA
[2] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA
[3] Ohio State Univ, Ctr Cognit Sci, Columbus, OH 43210 USA
DOI
10.1121/1.2363929
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
When a target speech signal is obscured by an interfering speech waveform, comprehension of the target message depends both on the successful detection of the energy from the target speech waveform and on the successful extraction and recognition of the spectro-temporal energy pattern of the target out of a background of acoustically similar masker sounds. This study attempted to isolate the effects that energetic masking, defined as the loss of detectable target information due to the spectral overlap of the target and masking signals, has on multitalker speech perception. This was achieved through the use of ideal time-frequency binary masks that retained those spectro-temporal regions of the acoustic mixture that were dominated by the target speech but eliminated those regions that were dominated by the interfering speech. The results suggest that energetic masking plays a relatively small role in the overall masking that occurs when speech is masked by interfering speech but a much more significant role when speech is masked by interfering noise. © 2006 Acoustical Society of America.
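The ideal binary mask described in the abstract can be illustrated with a minimal sketch. It assumes the target and masker energies are already available per time-frequency unit as nested lists (rows are frequency channels, columns are time frames). The function names `ideal_binary_mask` and `apply_mask`, the 0 dB threshold `lc_db`, the strict "greater than" rule at ties, and the toy numbers are all assumptions made for illustration; the abstract does not give the analysis front end or the criterion the study used.

```python
import math


def ideal_binary_mask(target_energy, masker_energy, lc_db=0.0):
    """Return 1 for units where the target dominates the masker, else 0.

    A unit is kept when its local target-to-masker ratio in dB is
    strictly greater than lc_db (an assumed threshold).
    """
    mask = []
    for t_row, m_row in zip(target_energy, masker_energy):
        row = []
        for t, m in zip(t_row, m_row):
            if t <= 0.0:
                keep = 0          # no target energy: nothing to retain
            elif m <= 0.0:
                keep = 1          # no masker energy: target trivially dominates
            else:
                keep = 1 if 10.0 * math.log10(t / m) > lc_db else 0
            row.append(keep)
        mask.append(row)
    return mask


def apply_mask(mixture, mask):
    """Zero out the mixture units that the mask rejects."""
    return [[x * k for x, k in zip(x_row, k_row)]
            for x_row, k_row in zip(mixture, mask)]


# Toy energies for 2 frequency channels x 3 time frames (made-up values).
target = [[4.0, 0.1, 2.0],
          [0.0, 3.0, 0.5]]
masker = [[1.0, 2.0, 2.0],
          [1.0, 0.0, 5.0]]

# Assumption: mixture energy is approximated as the sum of the two sources.
mixture = [[t + m for t, m in zip(t_row, m_row)]
           for t_row, m_row in zip(target, masker)]

mask = ideal_binary_mask(target, masker)
segregated = apply_mask(mixture, mask)
```

The mask is "ideal" in the sense that it is computed from the separate target and masker signals, which are only available before mixing. Raising or lowering `lc_db` in this sketch would retain fewer or more units.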
Pages: 4007-4018
Page count: 12
Related papers
50 in total
  • [41] TIME-FREQUENCY MASKING-BASED SPEECH ENHANCEMENT USING GENERATIVE ADVERSARIAL NETWORK
    Soni, Meet H.
    Shah, Neil
    Patil, Hemant A.
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5039 - 5043
  • [42] ACOUSTIC VECTOR SENSOR BASED REVERBERANT SPEECH SEPARATION WITH PROBABILISTIC TIME-FREQUENCY MASKING
    Zhong, Xionghu
    Chen, Xiaoyi
    Wang, Wenwu
    Alinaghi, Atiyeh
    Premkumar, A. B.
    2013 PROCEEDINGS OF THE 21ST EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2013,
  • [43] Robust Automatic Speech Recognition System Based on Using Adaptive Time-Frequency Masking
    Gouda, Ahmed Mostafa
    Tamazin, Mohamed
    Khedr, Mohamed
    PROCEEDINGS OF 2016 11TH INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING & SYSTEMS (ICCES), 2016, : 181 - 186
  • [44] Speech mask estimation using the time-frequency correlation of speech presence
    Zhan, Ge
    Huang, Zhao-Qiong
    Ying, Dong-Wen
    Pan, Jie-Lin
    Yan, Yong-Hong
    Ruan Jian Xue Bao/Journal of Software, 2016, 27 : 64 - 68
  • [45] PHASE TIME-FREQUENCY MASKING BASED SPEECH ENHANCEMENT ALGORITHM USING CIRCULAR MICROPHONE ARRAY
    He, Li
    Zhou, Yi
    Liu, Hongqing
    2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2019, : 808 - 813
  • [46] Impact of phase estimation on single-channel speech separation based on time-frequency masking
    Mayer, Florian
    Williamson, Donald S.
    Mowlaee, Pejman
    Wang, DeLiang
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2017, 141 (06): : 4668 - 4679
  • [47] Speech segregation by representation of modulation in time or frequency?
    Meyer, GF
    Berthommier, F
    BRITISH JOURNAL OF AUDIOLOGY, 1997, 31 (02): : 108 - 110
  • [48] On-line Speech Enhancement by Time-Frequency Masking under Prior Knowledge of Source Location
    Kang, Min Ah
    Jeong, Sangbae
    Hahn, Minsoo
    PROCEEDINGS OF WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY, VOL 25, 2007, 25 : 116 - 121
  • [49] Online blind speech separation using multiple acoustic speaker tracking and time-frequency masking
    Pertila, P.
    COMPUTER SPEECH AND LANGUAGE, 2013, 27 (03): : 683 - 702
  • [50] Watermarking of speech signals in the time-frequency domain
    Al-Khassaweneh, Mahmood
    Al-Zoubi, Hussein
    Aviyente, Selin
    2009 IEEE INTERNATIONAL CONFERENCE ON ELECTRO/INFORMATION TECHNOLOGY, 2009, : 317 - +