Noise Perturbation Improves Supervised Speech Separation

Cited by: 9

Authors:

Chen, Jitong [1]

Wang, Yuxuan [1]

Wang, DeLiang [1]

Affiliations:

[1] Ohio State Univ, Columbus, OH 43210 USA

Source:

LATENT VARIABLE ANALYSIS AND SIGNAL SEPARATION, LVA/ICA 2015 | 2015, Vol. 9237

Keywords:

Speech separation; Supervised learning; Noise perturbation; INTELLIGIBILITY; ALGORITHM

DOI:

10.1007/978-3-319-22482-4_10

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Speech separation can be treated as a mask estimation problem where interference-dominant portions are masked in a time-frequency representation of noisy speech. In supervised speech separation, a classifier is typically trained on a mixture set of speech and noise. Improving the generalization of a classifier is challenging, especially when interfering noise is strong and nonstationary. Expansion of a noise through proper perturbation during training exposes the classifier to more noise variations, and hence may improve separation performance. In this study, we examine the effects of three noise perturbations at low signal-to-noise ratios (SNRs). We evaluate speech separation performance in terms of hit minus false-alarm rate and short-time objective intelligibility (STOI). The experimental results show that frequency perturbation performs the best among the three perturbations. In particular, we find that frequency perturbation reduces the error of misclassifying a noise pattern as a speech pattern.
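The hit minus false-alarm rate named in the abstract can be illustrated with a small sketch. It assumes the common definition: HIT is the fraction of target-dominant time-frequency units that the estimated mask keeps, and FA is the fraction of interference-dominant units that it wrongly keeps. The function names, the 0 dB local criterion `lc_db`, and the toy energies are illustrative assumptions, not taken from the paper.

```python
import math

def ideal_binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """Label a T-F unit 1 when its local SNR exceeds lc_db, else 0.

    lc_db (local criterion) defaults to 0 dB here as an assumption.
    """
    eps = 1e-12  # guards against log10(0) and division by zero
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_row, n_row):
            local_snr_db = 10.0 * math.log10((s + eps) / (n + eps))
            row.append(1 if local_snr_db > lc_db else 0)
        mask.append(row)
    return mask

def hit_minus_fa(estimated_mask, ideal_mask):
    """Return HIT - FA for two binary masks of the same shape."""
    hits = targets = false_alarms = interferers = 0
    for e_row, i_row in zip(estimated_mask, ideal_mask):
        for e, i in zip(e_row, i_row):
            if i == 1:
                targets += 1
                hits += e          # target-dominant unit correctly kept
            else:
                interferers += 1
                false_alarms += e  # noise-dominant unit wrongly kept
    hit = hits / targets if targets else 0.0
    fa = false_alarms / interferers if interferers else 0.0
    return hit - fa

# Toy 2x2 time-frequency grid (made-up energies, for illustration only).
speech = [[4.0, 0.1], [9.0, 0.5]]
noise = [[1.0, 1.0], [1.0, 2.0]]
ibm = ideal_binary_mask(speech, noise)
estimate = [[1, 1], [1, 0]]  # one noise-dominant unit mislabeled as speech
score = hit_minus_fa(estimate, ibm)
```

In this toy case the estimate keeps both target-dominant units (HIT = 1.0) and wrongly keeps one of the two noise-dominant units (FA = 0.5), so the score is 0.5. A false alarm of this kind is the error the abstract describes as misclassifying a noise pattern as a speech pattern.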


Pages: 83-90

Page count: 8
