Using SVMs and discriminative models for speech recognition

Cited by: 0

Authors:

Smith, ND [1]

Gales, MJF [1]

Affiliations:

[1] Univ Cambridge, Engn Dept, Cambridge CB2 1PZ, England

Source:

2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS | 2002

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

In speech recognition, standard MAP decoders attribute speech data to the class with the highest posterior probability. This minimises the error rate under assumptions of model correctness. This assumption is invalid for speech recognition with HMMs. Hence, an interesting question is whether extra, useful information about the speech source can be extracted from the HMMs and used to lower error rates in practical systems. In this paper additional features are extracted from HMMs and incorporated into a multidimensional score-space. SVMs are then used to implement a decision rule. Preliminary experiments are performed on a small speaker-independent isolated letter task. Score-spaces based on discriminative models are used with previous results based on generative models. Both score-spaces outperform standard schemes.


Pages: 77-80

Page count: 4

9 references in total

[1] [Anonymous], ADV LARGE MARGIN CLA

[2] [Anonymous], 1998, CSDTR9804 U LOND ROY

[3] Fanty M., 1991, ADV NEURAL INFORM PR, P220

[4] JAAKKOLA TS, 1999, ADV NEURAL INFORMATI, V11

[5] Joachims J., 1999, ADV KERNEL METHODS S

[6] Loizou, PC; Spanias, AS. High-performance alphabet recognition [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1996, 4 (06): 430-445

[7] SMITH N, 2002, ADV NEURAL INFORMATI, V14

[8] VAPNIK V, 1995, NATURE STATE LEARNIN

[9] Woodland P.C., 2000, P ISCA ITRW ASR2000, P7
