Initial evaluation of hidden dynamic models on conversational speech

Cited by: 18

Authors:

Picone, J [1]

Pike, S [1]

Regan, R [1]

Kamm, T [1]

Bridle, J [1]

Deng, L [1]

Ma, Z [1]

Richards, H [1]

Schuster, M [1]

Affiliations:

[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA

Source:

ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI | 1999

Keywords:

DOI:

10.1109/ICASSP.1999.758074

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Conversational speech recognition is a challenging problem primarily because speakers rarely fully articulate sounds. A successful speech recognition approach must infer intended spectral targets from the speech data, or develop a method of dealing with large variances in the data. Hidden Dynamic Models (HDMs) attempt to automatically learn such targets in a hidden feature space using models that integrate linguistic information with constrained temporal trajectory models. HDMs are a radical departure from conventional hidden Markov models (HMMs), which simply account for variation in the observed data. In this paper, we present an initial evaluation of such models on a conversational speech recognition task involving a subset of the SWITCHBOARD corpus. We show that in an N-Best rescoring paradigm, HDMs are capable of delivering performance competitive with HMMs.


Pages: 109-112

Page count: 4