Visual Speech Recognition: a solution from feature extraction to words classification

Cited by: 11
Authors
Da Silveira, L [1]
Facon, J [1]
Borges, DL [1]
Affiliations
[1] Cambury Coll, Fac Cambury, Goiania, Go, Brazil
Source
XVI BRAZILIAN SYMPOSIUM ON COMPUTER GRAPHICS AND IMAGE PROCESSING, PROCEEDINGS | 2003
DOI
10.1109/SIBGRA.2003.1241036
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Audio-visual speech recognition has been an active area of research lately. A big, and yet unsolved, part of this problem is visual-only recognition, or lip reading. Given an image sequence of a person pronouncing a word, a full image analysis solution has to segment the mouth area, extract relevant features, and use those features to classify the word. In this paper we approach this problem by proposing a segmentation technique for the lip contours, together with a set of features based on the extracted contours, that is able to perform lip reading with promising results. We collected visual speech sequences in our lab and show results here for a set of ten words in Brazilian Portuguese, spoken by different speakers, in more than 150 samples. The approach can be extended and applied to other spoken languages as well.
Pages: 399-405
Number of pages: 7