Improved speech recognition using adaptive audio-visual fusion via a stochastic secondary classifier

Cited by: 5

Authors:

Lucey, S ^{[1]}

Sridharan, S ^{[1]}

Chandran, V ^{[1]}

Affiliations:

[1] Queensland Univ Technol, Sch Elect & Elect Syst Engn, RCSAVT, Speech Res Lab, Brisbane, Qld 4001, Australia

Source:

PROCEEDINGS OF 2001 INTERNATIONAL SYMPOSIUM ON INTELLIGENT MULTIMEDIA, VIDEO AND SPEECH PROCESSING | 2001

Keywords:

DOI:

10.1109/ISIMP.2001.925455

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The adaptive fusion of video and audio is one of the fundamental pursuits of audio-visual speech recognition (AVSR). In this paper, the use of a high-dimensional secondary classifier on the word likelihood scores from both the audio and video modalities is investigated for the purposes of adaptive fusion. Results are presented that lie above or equal to the boundary of catastrophic fusion across a number of audio noise levels.

Pages: 551-554

Page count: 4