Lip Shape and Hand Position Fusion for Automatic Vowel Recognition in Cued Speech for French

Cited by: 16

Authors:

Heracleous, Panikos [1]

Aboutabit, Noureddine [1]

Beautemps, Denis [1]

Affiliation:

[1] Domaine Univ, Speech & Cognit Dept, Gipsa Lab, F-38402 Grenoble, France

Source:

IEEE SIGNAL PROCESSING LETTERS | 2009 / Vol. 16 / No. 5

Keywords:

Concatenative fusion; Cued Speech; HMM; multistream HMM decision fusion; vowel recognition

DOI:

10.1109/LSP.2009.2016011

Chinese Library Classification (CLC):

TM [Electrical Technology]; TN [Electronic Technology, Communication Technology]

Discipline classification codes:

0808; 0809

Abstract:

Cued Speech is a visual mode of communication that uses handshapes and placements in combination with the mouth movements of speech to make the phonemes of a spoken language look different from each other and clearly understandable to deaf and hearing-impaired people. The aim of Cued Speech is to overcome the problems of lip reading and thus enable deaf children and adults to fully understand spoken language. Cued Speech recognition requires hand gesture recognition and lip shape recognition, as well as integration of the two components. This article presents hidden Markov model (HMM)-based vowel recognition as used in Cued Speech for French. Using concatenative feature fusion and multistream HMM decision fusion, the lip shape and hand position components were integrated into a single component, and automatic vowel recognition was realized. With multistream HMM decision fusion, the vowel classification accuracy obtained using lip shape and hand position information was 87.6%, an absolute improvement of 19.6% over using lip parameters alone.
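
The two fusion strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, feature dimensions, sample values, and the stream weight are all hypothetical, and the weighting follows the common multistream formulation in which each stream's state log-likelihood is scaled by a weight and the weights sum to 1.

```python
def concatenative_fusion(lip_frames, hand_frames):
    """Feature fusion: join each frame's lip and hand vectors into one vector."""
    if len(lip_frames) != len(hand_frames):
        raise ValueError("streams must be frame-synchronous")
    return [list(lip) + list(hand) for lip, hand in zip(lip_frames, hand_frames)]


def multistream_log_emission(lip_logp, hand_logp, lip_weight):
    """Decision fusion: weighted sum of per-stream state log-likelihoods.

    Equivalent to raising each stream's likelihood to its weight and
    multiplying; the two weights are constrained to sum to 1.
    """
    if not 0.0 <= lip_weight <= 1.0:
        raise ValueError("lip_weight must lie in [0, 1]")
    return lip_weight * lip_logp + (1.0 - lip_weight) * hand_logp


# Hypothetical data: 3 frames, 2 lip parameters and 2 hand coordinates each.
lip = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
hand = [[1.0, 2.0], [1.1, 2.1], [1.2, 2.2]]

fused = concatenative_fusion(lip, hand)  # 3 frames of 4-dimensional vectors
score = multistream_log_emission(-2.0, -4.0, lip_weight=0.7)  # assumed weight
```

In the first case a single HMM would be trained on the joined vectors; in the second, each stream keeps its own emission model and only the scores are combined, so the relative reliability of lips and hand can be tuned through the weight.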

Pages: 339-342

Page count: 4
