A robust speech disorders correction system for Arabic language using visual speech recognition

Cited by: 0

Authors:

Farag, Ahmed [1]

El Adawy, Mohamed [1]

Ismail, Ahmed [2]

Affiliations:

[1] Helwan University, Dept Biomed Engn, Cairo, Egypt

[2] HTI, Dept Biomed Engn, Cairo, Egypt

Source:

BIOMEDICAL RESEARCH-INDIA | 2013 / Vol. 24 / Issue 02

Keywords:

Speech Processing; Visual Speech; Arabic Speech Recognition; Speech Disorders Classification and Lips Detection; AUDIOVISUAL SPEECH;

DOI:

Not available

Chinese Library Classification (CLC) number:

R318 [Biomedical Engineering];

Discipline classification code:

0831 ;

Abstract:

In this paper, we propose an automatic speech disorder recognition technique based on the analysis of both speech and visual components. First, we performed the pre-processing steps required for speech recognition, then chose Mel-frequency cepstral coefficients (MFCCs) as the features representing the speech signal. We also studied the visual components through lip movement analysis. We propose a new technique that integrates audio and video signal analysis to increase the efficiency of automated speech disorder recognition systems. The main idea is to detect motion features from a series of lip images, for which a new lip movement detection technique is proposed. Finally, we use a multi-layer neural network as a classifier for both speech and visual features. We propose a new technique for speech disorder correction systems, especially for the Arabic language. Practical experiments showed that our system is useful when dealing with Arabic speech disorders.
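The pipeline the abstract outlines (MFCC features from audio, motion features from a series of lip images, then a shared classifier input) can be illustrated with a minimal sketch. This record does not give the authors' actual algorithms or parameters, so everything below is an assumption: the frame sizes, the 26-filter mel bank, the 13 coefficients, the use of block-averaged frame differencing as a stand-in for the paper's lip movement detector, and the function names `mfcc`, `lip_motion_features`, and `fuse` are all hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr=16000, frame_len=400, hop=160, n_fft=512,
         n_mels=26, n_ceps=13):
    """Textbook MFCC computation; all parameter values are assumed."""
    signal = np.asarray(signal, dtype=float)
    # Pre-emphasis boosts high frequencies before framing.
    sig = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    n_frames = 1 + (len(sig) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = sig[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Triangular filters spaced evenly on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):
            fbank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):
            fbank[m - 1, k] = (right - k) / (right - center)
    log_energy = np.log(power @ fbank.T + 1e-10)
    # Unnormalized DCT-II, keeping the first n_ceps coefficients.
    n = np.arange(n_mels)
    k = np.arange(n_ceps)[:, None]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    return log_energy @ basis.T  # shape: (n_frames, n_ceps)

def lip_motion_features(frames, grid=(2, 4)):
    """Block-averaged absolute frame differences over a mouth region.

    `frames` has shape (T, H, W). This is a generic motion cue,
    not the lip movement detector proposed in the paper.
    """
    frames = np.asarray(frames, dtype=float)
    diff = np.abs(np.diff(frames, axis=0))  # (T-1, H, W)
    gh, gw = grid
    t, h, w = diff.shape
    bh, bw = h // gh, w // gw
    diff = diff[:, :gh * bh, :gw * bw]  # crop to a multiple of the grid
    blocks = diff.reshape(t, gh, bh, gw, bw).mean(axis=(2, 4))
    return blocks.reshape(t, gh * gw)

def fuse(audio_feats, visual_feats):
    """Average each stream over time and concatenate into one vector."""
    return np.concatenate([audio_feats.mean(axis=0),
                           visual_feats.mean(axis=0)])
```

The fused vector could then be passed to any multi-layer neural network classifier. Averaging over time is the simplest way to get a fixed-length input and discards temporal order, so a real system would likely use a richer summary.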


Pages: 185-192

Page count: 8
