A robust Viterbi algorithm against impulsive noise with application to speech recognition

Cited by: 14

Authors:

Siu, Manhung [1]

Chan, Arthur [1]

Affiliation:

[1] Hong Kong Univ Sci & Technol, Dept Elect & Elect Engn, Kowloon, Hong Kong, Peoples R China

Source:

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2006 / Vol. 14 / No. 06

Keywords:

noisy environment; robustness; search algorithm; speech recognition; Viterbi algorithm;

DOI:

10.1109/TASL.2006.872592

Chinese Library Classification number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

The Viterbi algorithm has been successfully applied to different pattern recognition and communication tasks. However, if some observations are corrupted by unknown impulsive noise that is not accounted for by the distortion measures, recognition performance can degrade significantly. In this paper, we propose a robust Viterbi algorithm that handles short impulsive noises with unknown characteristics by means of joint decoding and detection during the Viterbi search. To make the algorithm applicable to different noisy conditions with varying amounts of impulsive noise, we further propose an approach to efficiently estimate the number of corruptions. We demonstrate the effectiveness of the proposed robust algorithms using spoken digit recognition experiments under two different impulsive noise environments. Under random Gaussian replacement noise, the proposed algorithm reduced digit error by more than 65%. Under the GSM network environment, in which lost frames are replaced by interpolated neighboring frames, the robust algorithm reduced digit error by 20%. Furthermore, the proposed algorithm does not degrade performance when impulsive noise is not present.
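The idea of joint decoding and detection can be illustrated with a small sketch. This is not the authors' formulation, which the abstract does not spell out. It is one plausible reading in which the search state is extended with a count of frames flagged as corrupted, and a flagged frame contributes a fixed score instead of its emission likelihood. The function name `robust_viterbi`, the parameters `max_corrupt` and `corrupt_logp`, and the toy numbers are all assumptions made for illustration.

```python
import math

NEG_INF = float("-inf")

def robust_viterbi(log_init, log_trans, log_emit, max_corrupt, corrupt_logp=0.0):
    """Viterbi search over (state, number of frames flagged as corrupted).

    log_emit[t][s] is the log-likelihood of frame t in state s. A flagged
    frame scores corrupt_logp (an assumed constant) instead of log_emit[t][s].
    Returns (best log score, state path, indices of flagged frames).
    """
    T, S, K = len(log_emit), len(log_init), max_corrupt
    delta = [[NEG_INF] * (K + 1) for _ in range(S)]  # delta[s][k]
    back = []  # back[t][s][k] = (previous state, previous k, flagged at t)
    for t in range(T):
        new = [[NEG_INF] * (K + 1) for _ in range(S)]
        ptr = [[None] * (K + 1) for _ in range(S)]
        for s in range(S):
            for k in range(K + 1):
                # Either trust the frame (k unchanged) or flag it (k grows by 1).
                options = [(log_emit[t][s], k, False)]
                if k > 0:
                    options.append((corrupt_logp, k - 1, True))
                for frame_score, pk, flagged in options:
                    if t == 0:
                        if pk != 0:
                            continue  # nothing can be flagged before frame 0
                        prev, cand = None, log_init[s] + frame_score
                    else:
                        prev = max(range(S),
                                   key=lambda p: delta[p][pk] + log_trans[p][s])
                        cand = delta[prev][pk] + log_trans[prev][s] + frame_score
                    if cand > new[s][k]:
                        new[s][k] = cand
                        ptr[s][k] = (prev, pk, flagged)
        delta = new
        back.append(ptr)
    # Pick the best end point over all states and flag counts, then backtrack.
    score, s, k = max((delta[s][k], s, k) for s in range(S) for k in range(K + 1))
    path, flags = [], []
    for t in range(T - 1, -1, -1):
        path.append(s)
        prev, pk, flagged = back[t][s][k]
        if flagged:
            flags.append(t)
        s, k = prev, pk
    return score, path[::-1], sorted(flags)

# Toy example: two sticky states; frame 2 looks like an impulsive outlier.
log_init = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.9), math.log(0.1)], [math.log(0.1), math.log(0.9)]]
log_emit = [[-1.0, -5.0], [-1.0, -5.0], [-50.0, -1.0], [-1.0, -5.0], [-1.0, -5.0]]

print(robust_viterbi(log_init, log_trans, log_emit, max_corrupt=0))
print(robust_viterbi(log_init, log_trans, log_emit, max_corrupt=1, corrupt_logp=-3.0))
```

In this toy setup the plain search (`max_corrupt=0`) is pulled into the other state by the outlier frame, while allowing one flagged frame lets the path stay put. The sketch takes `max_corrupt` as given; the abstract states that the paper additionally proposes a way to estimate the number of corruptions, which is not modeled here.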


Pages: 2122-2133

Page count: 12

