A FEATURE STUDY FOR CLASSIFICATION-BASED SPEECH SEPARATION AT VERY LOW SIGNAL-TO-NOISE RATIO

Cited by: 0
Authors
Chen, Jitong [1]
Wang, Yuxuan [1]
Wang, DeLiang [1]
Affiliation
[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA
Source
2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2014
Keywords
Speech separation; classification; multi-resolution cochleagram; ARMA filtering; recognition
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206; 082403
Abstract
Speech separation is a challenging problem at low signal-to-noise ratios (SNRs). Separation can be formulated as a classification problem. In this study, we focus on an SNR of -5 dB, at which speech is generally dominated by background noise. At such a low SNR, extracting robust features from a noisy mixture is crucial for successful classification. Using a common neural network classifier, we systematically compare the separation performance of many monaural features. In addition, we propose a new feature called the Multi-Resolution Cochleagram (MRCG), which is extracted from four cochleagrams of different resolutions to capture both local information and spectrotemporal context. Comparisons using two non-stationary noises show a range of feature robustness for speech separation, with the proposed MRCG performing best. We also find that ARMA filtering, a post-processing technique previously used for robust speech recognition, improves speech separation performance by smoothing the temporal trajectories of feature dimensions.
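The abstract does not state the ARMA filter's formula or order, so the following is only a minimal sketch of one commonly used form of ARMA smoothing from the robust speech recognition literature: each output frame is the average of the previous `order` filtered frames and the current plus next `order` unfiltered frames. The function name `arma_smooth`, the default `order=2`, and the choice to leave edge frames unfiltered are assumptions made for illustration, not details taken from this record.

```python
import numpy as np


def arma_smooth(features, order=2):
    """Smooth each feature dimension along time with an ARMA filter.

    features: array of shape (num_frames, num_dims).
    order:    hypothetical filter order (assumed; not given in the abstract).
    Returns a new array of the same shape; the input is not modified.
    """
    x = np.asarray(features, dtype=float)
    num_frames = x.shape[0]
    y = x.copy()
    # Frames within `order` of either edge are left unfiltered (assumption).
    for t in range(order, num_frames - order):
        # Autoregressive part: already-smoothed outputs y[t-order .. t-1].
        past_outputs = y[t - order:t].sum(axis=0)
        # Moving-average part: raw inputs x[t .. t+order].
        current_and_future = x[t:t + order + 1].sum(axis=0)
        y[t] = (past_outputs + current_and_future) / (2 * order + 1)
    return y


if __name__ == "__main__":
    # Toy trajectory: a single spike in one feature dimension.
    spike = np.zeros((9, 1))
    spike[4, 0] = 1.0
    print(arma_smooth(spike, order=2).ravel())
```

Because the filter weights sum to one, a constant trajectory passes through unchanged, while an isolated spike is spread over neighbouring frames; this is the sense in which the temporal trajectory of each feature dimension is smoothed.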
Pages: 5