NON-LINEAR NOISE COMPENSATION FOR ROBUST SPEECH RECOGNITION USING GAUSS-NEWTON METHOD

Cited by: 0
Authors
Zhao, Yong [1]
Juang, Biing-Hwang [1]
Affiliations
[1] Georgia Inst Technol, Ctr Signal & Image Proc, Atlanta, GA 30332 USA
Source
2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2011
Keywords
Gauss-Newton method; non-linear compensation; robust speech recognition; vector Taylor series
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, we present the Gauss-Newton method as a unified approach to optimizing non-linear noise compensation models, such as vector Taylor series (VTS), data-driven parallel model combination (DPMC), and unscented transform (UT). We demonstrate that the commonly used approaches that iteratively approximate the noise parameters in an EM framework are variants of the Gauss-Newton method. Through the formulation of the Gauss-Newton method for estimating noise means and variances, the noise estimation problems are reduced to determining the Jacobians of the noisy speech distributions. For the sampling-based compensations, we present two methods, sample Jacobian average (SJA) and cross-covariance (XCOV), to evaluate the Jacobians. Experiments on the Aurora 2 database verify the efficacy of the Gauss-Newton method for these noise compensation models.
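To make the role of the Jacobian in the abstract concrete, the following is a minimal sketch of a Gauss-Newton iteration for a noise mean. It is not the paper's algorithm: it assumes the standard log-spectral mismatch function y = x + log(1 + exp(n - x)) with no channel term, treats the problem as a plain least-squares fit rather than likelihood maximization in an EM framework, ignores variances, and uses hypothetical names (`mismatch`, `jacobian_wrt_noise`, `gauss_newton_noise_mean`) and synthetic data.

```python
import numpy as np


def mismatch(x, n):
    """Assumed log-spectral mismatch: y = x + log(1 + exp(n - x)), elementwise."""
    return x + np.logaddexp(0.0, n - x)


def jacobian_wrt_noise(x, n):
    """dy/dn = exp(n - x) / (1 + exp(n - x)); the Jacobian is diagonal here."""
    return 1.0 / (1.0 + np.exp(x - n))


def gauss_newton_noise_mean(x_means, y_obs, n_init, num_iters=50):
    """Fit one shared noise mean n (shape D) to K clean/noisy mean pairs.

    x_means, y_obs: arrays of shape (K, D). Each iteration linearizes the
    mismatch function around the current n and solves the resulting linear
    least-squares problem, which is the Gauss-Newton step. Because the
    Jacobian is diagonal, the normal equations decouple per dimension.
    """
    n = np.array(n_init, dtype=float)
    for _ in range(num_iters):
        r = y_obs - mismatch(x_means, n)      # residuals, shape (K, D)
        J = jacobian_wrt_noise(x_means, n)    # Jacobian entries, shape (K, D)
        n = n + (J * r).sum(axis=0) / (J * J).sum(axis=0)
    return n


# Synthetic illustration only: these values are made up, not from Aurora 2.
rng = np.random.default_rng(0)
x_demo = 2.0 * rng.standard_normal((8, 3))
n_demo_true = np.array([0.5, -1.0, 1.5])
y_demo = mismatch(x_demo, n_demo_true)
n_demo_est = gauss_newton_noise_mean(x_demo, y_demo, np.zeros(3))
print(n_demo_est)
```

In this toy setting the only model-specific ingredient is `jacobian_wrt_noise`; a sampling-based compensation would need to estimate that quantity from samples instead of evaluating it in closed form.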
Pages: 4796-4799
Page count: 4