Large vocabulary audio-visual speech recognition using the Janus speech recognition toolkit

Cited by: 0

Authors:

Kratt, J [1]

Metze, F [1]

Stiefelhagen, R [1]

Waibel, A [1]

Affiliations:

[1] Univ Karlsruhe, Interact Syst Labs, Karlsruhe, Germany

Source:

PATTERN RECOGNITION | 2004, Vol. 3175

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper describes audio-visual speech recognition experiments on a multi-speaker, large vocabulary corpus using the Janus speech recognition toolkit. We describe a complete audio-visual speech recognition system and present experiments on this corpus. By using visual cues as additional input to the speech recognizer, we observed good improvements, both on clean and noisy speech in our experiments.

Pages: 488-495

Number of pages: 8
