Robust Speech Recognition Using a Harmonic Model

Cited by: 0

Authors:

Xu Chao (许超)

Cao Zhigang (曹志刚)

Affiliation:

Department of Electronic Engineering, Tsinghua University, Beijing 100084, China

Source:

Tsinghua Science and Technology | 2004, Issue 02

Funding:

National Natural Science Foundation of China

Keywords:

robust speech recognition; speech enhancement; pitch extraction; harmonic model;

DOI:

Not available

Chinese Library Classification (CLC) number:

TN912.3 [Speech signal processing]

Discipline classification code:

0711

Abstract:

Automatic speech recognition in noisy environments remains a challenging problem. Traditionally, methods focused on noise structure, such as spectral subtraction, have been employed to address this problem, and the performance of such methods therefore depends on the accuracy of the noise estimate. In this paper, an alternative method, using a harmonic-based spectral reconstruction algorithm, is proposed for speech enhancement in robust automatic speech recognition. Neither noise estimation nor noise-model training is required in the proposed approach. A spectral subtraction integrated autocorrelation function is proposed to determine the pitch for the harmonic model. Recognition results show that the harmonic-based spectral reconstruction approach outperforms spectral subtraction in the middle and low signal-to-noise ratio (SNR) ranges. The advantage of the proposed method is more pronounced for non-stationary noise, as the algorithm does not assume that the noise is stationary.
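The abstract does not give the details of the proposed pitch estimator, so the following is only a minimal sketch of the general idea it builds on: estimating pitch from the peak of a frame's autocorrelation function and listing the harmonic frequencies that a harmonic model would place at multiples of that pitch. It uses a plain autocorrelation, not the spectral subtraction integrated variant proposed in the paper. The function names, the 60-400 Hz search range, and the test signal are assumptions made for illustration.

```python
import numpy as np


def estimate_pitch_autocorr(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate the pitch (Hz) of one frame from its autocorrelation peak.

    Plain autocorrelation only; the f_min/f_max search range is an assumption.
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()  # remove DC offset before correlating
    # Full autocorrelation, then keep the non-negative lags only.
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(sample_rate / f_max)  # shortest period considered
    lag_max = min(int(sample_rate / f_min), len(acf) - 1)  # longest period
    best_lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return sample_rate / best_lag


def harmonic_frequencies(pitch_hz, sample_rate):
    """Return the multiples of the pitch up to the Nyquist frequency."""
    n_harmonics = int((sample_rate / 2) // pitch_hz)
    return pitch_hz * np.arange(1, n_harmonics + 1)


if __name__ == "__main__":
    sample_rate = 8000
    t = np.arange(320) / sample_rate  # one 40 ms frame
    # Synthetic voiced frame: 200 Hz fundamental plus its second harmonic.
    frame = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 400 * t)
    pitch = estimate_pitch_autocorr(frame, sample_rate)
    print(pitch, harmonic_frequencies(pitch, sample_rate)[:3])
```

In a harmonic-model setting, the returned frequencies mark where speech energy is expected in a voiced frame. How the paper reconstructs the spectrum from those positions is not described in the abstract.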


Pages: 202-206

Page count: 5

