A FEATURE STUDY FOR CLASSIFICATION-BASED SPEECH SEPARATION AT VERY LOW SIGNAL-TO-NOISE RATIO

Cited by: 0

Authors:

Chen, Jitong [1]

Wang, Yuxuan [1]

Wang, DeLiang [1]

Affiliation:

[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA

Source:

2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2014

Keywords:

Speech separation; classification; multi-resolution cochleagram; ARMA filtering; RECOGNITION;

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Speech separation is a challenging problem at low signal-to-noise ratios (SNRs). Separation can be formulated as a classification problem. In this study, we focus on the SNR level of -5 dB in which speech is generally dominated by background noise. In such a low SNR condition, extracting robust features from a noisy mixture is crucial for successful classification. Using a common neural network classifier, we systematically compare separation performance of many monaural features. In addition, we propose a new feature called Multi-Resolution Cochleagram (MRCG), which is extracted from four cochleagrams of different resolutions to capture both local information and spectrotemporal context. Comparisons using two non-stationary noises show a range of feature robustness for speech separation with the proposed MRCG performing the best. We also find that ARMA filtering, a post-processing technique previously used for robust speech recognition, improves speech separation performance by smoothing the temporal trajectories of feature dimensions.
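The abstract describes ARMA filtering as smoothing the temporal trajectory of each feature dimension. The following is a minimal sketch of one common formulation, in which each output frame is the average of the previous `order` filtered frames and the current and next `order` unfiltered frames. The filter order, the function names `arma_smooth` and `arma_smooth_features`, and the choice to leave edge frames unfiltered are assumptions for illustration; this record does not state the settings used in the paper.

```python
def arma_smooth(trajectory, order=2):
    """Smooth one feature dimension over time with an ARMA filter.

    Assumed form: y[t] = (y[t-m] + ... + y[t-1] + x[t] + ... + x[t+m]) / (2m + 1),
    where m is `order`. Frames closer than m to either end are left unchanged
    (an assumption made here to keep the sketch simple).
    """
    n = len(trajectory)
    smoothed = [float(v) for v in trajectory]
    for t in range(order, n - order):
        past_outputs = sum(smoothed[t - order:t])               # autoregressive part
        current_and_future = sum(trajectory[t:t + order + 1])   # moving-average part
        smoothed[t] = (past_outputs + current_and_future) / (2 * order + 1)
    return smoothed


def arma_smooth_features(frames, order=2):
    """Apply arma_smooth independently to every dimension of a frames-by-dims list."""
    columns = list(zip(*frames))  # one tuple per feature dimension
    smoothed_columns = [arma_smooth(col, order) for col in columns]
    return [list(row) for row in zip(*smoothed_columns)]


if __name__ == "__main__":
    # A single spike in an otherwise flat trajectory is spread out and attenuated.
    print(arma_smooth([0.0, 0.0, 0.0, 10.0, 0.0, 0.0, 0.0], order=1))
```

Because the filter reuses its own past outputs, a spike in the input decays gradually over the following frames instead of disappearing as soon as it leaves the averaging window.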


Pages: 5

50 items in total

[1] A Feature Study for Classification-Based Speech Separation at Low Signal-to-Noise Ratios
Chen, Jitong
Wang, Yuxuan
Wang, DeLiang
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (12) : 1993 - 2002
[2] Denoising method for Raman spectra with low signal-to-noise ratio based on feature extraction
Zhao, X. Y.
Liu, G. Y.
Sui, Y. T.
Xu, M.
Tong, L.
SPECTROCHIMICA ACTA PART A-MOLECULAR AND BIOMOLECULAR SPECTROSCOPY, 2021, 250
[3] An approach to ARMA system identification at a very low signal-to-noise ratio
Fattah, SA
Zhu, WP
Ahmad, MO
2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005 : 113 - 116
[4] A pitch detection method for speech signals with low signal-to-noise ratio
Shahnaz, C.
Zhu, W. -P.
Ahmad, M. O.
2007 INTERNATIONAL SYMPOSIUM ON SIGNALS, SYSTEMS AND ELECTRONICS, VOLS 1 AND 2, 2007 : 386 - 389
[5] The optimal ratio time-frequency mask for speech separation in terms of the signal-to-noise ratio
Liang, Shan
Liu, Wenju
Jiang, Wei
Xue, Wei
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2013, 134 (05) : EL452 - EL458
[6] Information Approach to Signal-to-Noise Ratio Estimation of the Speech Signal
Gai, Vasiliy
INFORMATION TECHNOLOGIES AND MATHEMATICAL MODELLING, 2014, 487 : 137 - 144
[7] A formant frequency estimation algorithm for speech signals with low signal-to-noise ratio
Fattah, S. A.
Zhu, W. -P.
Ahmad, M. O.
2007 50TH MIDWEST SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOLS 1-3, 2007 : 81 - 84
[8] An effective pitch detection method for speech signals with low signal-to-noise ratio
Zhao, Zhen-Dong
Hu, Xi-Mei
Tian, Jing-Feng
PROCEEDINGS OF 2008 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2008, : 2775 - +
[9] THE MEASUREMENT OF THE SIGNAL-TO-NOISE RATIO (SNR) IN CONTINUOUS SPEECH
KLINGHOLZ, F
SPEECH COMMUNICATION, 1987, 6 (01) : 15 - 26
[10] SIGNAL-TO-NOISE RATIO AS A PREDICTOR OF SPEECH TRANSMISSION QUALITY
SEN, TK
CARROLL, JD
IEEE TRANSACTIONS ON AUDIO AND ELECTROACOUSTICS, 1973, AU21 (04) : 384 - 387
