Robust Speech Recognition Using a Harmonic Model

Cited by: 0
Authors
Xu Chao (许超)
Cao Zhigang (曹志刚)
Affiliation
Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
Funding
National Natural Science Foundation of China
Keywords
robust speech recognition; speech enhancement; pitch extraction; harmonic model;
DOI
Not available
Chinese Library Classification (CLC) number
TN912.3 [Speech signal processing]
Subject classification code
0711
Abstract
Automatic speech recognition in noisy environments remains a challenging problem. Traditionally, methods that focus on noise structure, such as spectral subtraction, have been employed to address this problem, so their performance depends on the accuracy of the noise estimate. In this paper, an alternative method based on a harmonic-based spectral reconstruction algorithm is proposed to improve the robustness of automatic speech recognition. Neither noise estimation nor noise-model training is required in the proposed approach. A spectral subtraction integrated autocorrelation function is proposed to determine the pitch for the harmonic model. Recognition results show that the harmonic-based spectral reconstruction approach outperforms spectral subtraction in the middle and low signal-to-noise ratio (SNR) ranges. The advantage of the proposed method is more pronounced for non-stationary noise, because the algorithm does not assume that the noise is stationary.
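The abstract names a spectral subtraction integrated autocorrelation function for pitch extraction but gives no formula for it. The following is only a minimal sketch of one plausible reading, not the authors' algorithm: the autocorrelation is obtained as the inverse FFT of the frame's power spectrum (Wiener-Khinchin relation), with an optional subtraction step applied to that power spectrum first. The function name `estimate_pitch`, the `noise_power` argument, the Hann window, and the 60-400 Hz search range are all assumptions introduced for illustration.

```python
import numpy as np

def estimate_pitch(frame, sample_rate, noise_power=None, fmin=60.0, fmax=400.0):
    """Return a pitch estimate in Hz for one speech frame.

    Hypothetical helper: the autocorrelation is computed from the power
    spectrum, optionally after subtracting `noise_power` (an assumed
    per-bin noise power estimate of length n + 1 for a frame of n samples).
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    nfft = 2 * n  # zero-padding avoids circular wrap-around in the autocorrelation
    power = np.abs(np.fft.rfft(frame * np.hanning(n), nfft)) ** 2
    if noise_power is not None:
        # Spectral subtraction on the power spectrum, floored at zero.
        power = np.maximum(power - noise_power, 0.0)
    # Wiener-Khinchin: inverse FFT of the power spectrum is the autocorrelation.
    acf = np.fft.irfft(power, nfft)[:n]
    # Search only lags that correspond to the assumed pitch range.
    lag_min = int(sample_rate / fmax)
    lag_max = min(int(sample_rate / fmin), n - 1)
    lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return sample_rate / lag

# Usage: a synthetic voiced frame with three harmonics of 200 Hz.
sample_rate = 8000
t = np.arange(512) / sample_rate
voiced = sum(np.cos(2 * np.pi * 200.0 * k * t) for k in (1, 2, 3))
pitch_hz = estimate_pitch(voiced, sample_rate)
```

Once a pitch estimate is available, a harmonic model can place spectral components at integer multiples of it, which is the role the abstract assigns to pitch extraction.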
Pages: 202-206
Page count: 5