Speech enhancement using a pitch predictive model

Cited by: 6
Authors
Buera, Luis [1]
Droppo, Jasha [2]
Acero, Alex [2]
Affiliations
[1] Univ Zaragoza, GTC, E-50009 Zaragoza, Spain
[2] Microsoft Res, Speech Res Grp, Redmond, WA USA
Source
2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12 | 2008
Keywords
speech enhancement; speech analysis; speech recognition; robustness
DOI
10.1109/ICASSP.2008.4518752
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
In this paper we present two new methods for speech enhancement based on the previously published fine pitch model (FPM) for voiced speech. The first method (FPM-NE) uses the FPM to produce a non-stationary noise estimate that can be used in any standard speech enhancement system. In this method, the FPM is used indirectly to perform speech enhancement. The second method we describe (FPM-SE) uses the FPM directly to perform speech enhancement. We present a study of the behavior of the two models on the standard Aurora 2 task, and demonstrate improvements of over 45% average word error rate reduction over the multi-style baseline.
Pages: 4885 / +
Page count: 2