Lip reading of hearing impaired persons using HMM

Cited by: 33
Authors
Puviarasan, N. [1]
Palanivel, S. [1]
Affiliation
[1] Annamalai Univ, Dept Comp Sci & Engn, Annamalainagar 608002, Tamil Nadu, India
Keywords
Lip reading; Face detection; Mouth detection; Discrete cosine transform; Discrete wavelet transform; Hidden Markov model; EXTRACTION; SPEECH
DOI
10.1016/j.eswa.2010.09.119
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper describes a method for lip reading of hearing impaired persons. Lip reading refers to recognizing spoken words from visual speech information such as lip movements. A video of the hearing impaired person speaking is given as input to a face detection module, which detects the face region. The mouth region is then located relative to the face region, and the mouth images are used for feature extraction. Features are extracted using the discrete cosine transform (DCT) and the discrete wavelet transform (DWT), and each feature set is applied separately as input to a hidden Markov model (HMM) for recognizing the visual speech. To understand the visual speech of hearing impaired persons at cash collection counters, 33 words are chosen. For each word, 20 samples are collected for training the HMM and another five samples are used for testing. The experimental results show that the method achieves a recognition performance of 91.0% for the DCT-based lip features and 97.0% for the DWT-based lip features. (C) 2010 Elsevier Ltd. All rights reserved.
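The DCT feature step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the mouth image size, the number of coefficients retained, or any preprocessing, so the 4x4 patch, the `keep` parameter, and the function names below are assumptions chosen for illustration, not the authors' implementation. The sketch computes an orthonormal 2D DCT-II and keeps the low-frequency (top-left) coefficients as a feature vector, which is one common way to build DCT-based image features.

```python
import math


def dct_2d(block):
    """Orthonormal 2D DCT-II of a rectangular grayscale block (list of rows)."""
    rows, cols = len(block), len(block[0])

    def scale(k, n):
        # Normalisation that makes the transform orthonormal.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    coeffs = [[0.0] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0.0
            for x in range(rows):
                for y in range(cols):
                    total += (
                        block[x][y]
                        * math.cos((2 * x + 1) * u * math.pi / (2 * rows))
                        * math.cos((2 * y + 1) * v * math.pi / (2 * cols))
                    )
            coeffs[u][v] = scale(u, rows) * scale(v, cols) * total
    return coeffs


def low_frequency_features(block, keep=2):
    """Flatten the top-left keep x keep DCT coefficients into a feature vector.

    `keep` is a hypothetical parameter; the abstract does not say how many
    coefficients the authors retain.
    """
    coeffs = dct_2d(block)
    return [coeffs[u][v] for u in range(keep) for v in range(keep)]


# Hypothetical 4x4 grayscale mouth patch (values are made up for illustration).
mouth_patch = [
    [52, 55, 61, 66],
    [63, 59, 55, 90],
    [62, 59, 68, 113],
    [63, 58, 71, 122],
]

features = low_frequency_features(mouth_patch, keep=2)
print(len(features))  # 4 coefficients per frame in this sketch
```

In a pipeline like the one described, one such vector would be computed per video frame, and the resulting sequence of vectors would be passed to the word models. How the paper quantizes or models these vectors inside the HMM is not given in the abstract.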
Pages: 4477-4481
Number of pages: 5