LSTM model for visual speech recognition through facial expressions

Cited by: 21
Authors
Bhaskar, Shabina [1 ]
Thasleema, T. M. [1 ]
Affiliations
[1] Cent Univ Kerala, Kasaragod, Kerala, India
Keywords
Audio-visual emotion recognition; Audio-visual speech recognition; Hearing impaired; Convolutional neural network; Long short-term memory
DOI
10.1007/s11042-022-12796-1
Chinese Library Classification
TP [Automation technology; computer technology]
Subject classification code
0812
Abstract
Hearing-impaired persons are more expressive while speaking, and facial expression is therefore a salient feature in hearing-impaired Visual Speech Recognition. Most Visual Speech Recognition systems focus only on the lip region to recognize the speech or the speaker. This work uses video data that carries information from both speech and facial expressions. As part of this study, we developed a Malayalam audio-visual speech expression database of unimpaired people, and the experiments were conducted on this newly developed database. The data were collected from two people, one male and one female. A combined Convolutional Neural Network and Long Short-Term Memory (CNN-LSTM) deep learning video-processing model is applied in this system. The results demonstrate that classification accuracy is higher for features extracted with the GoogleNet model than with the AlexNet and ResNet models. The system is evaluated in both speaker-dependent and speaker-independent settings. The recognition rates obtained in both experiments show that facial expression analysis plays a crucial role in Visual Speech Recognition.
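The CNN-LSTM pipeline described in the abstract can be sketched as follows: per-frame CNN features (e.g. GoogleNet activations, assumed precomputed) are fed to an LSTM, and the final hidden state is classified with a softmax. All dimensions, the single-layer LSTM, and the random parameters are illustrative assumptions, not details from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; gate order in the stacked weights is [i, f, g, o]."""
    H = h.shape[0]
    z = x @ W + h @ U + b                      # all four gates at once, shape (4H,)
    i = sigmoid(z[:H])                         # input gate
    f = sigmoid(z[H:2 * H])                    # forget gate
    g = np.tanh(z[2 * H:3 * H])                # candidate cell state
    o = sigmoid(z[3 * H:])                     # output gate
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

def classify_sequence(frame_feats, W, U, b, W_out, b_out):
    """Run an LSTM over per-frame CNN features, softmax the final state."""
    H = U.shape[0]
    h, c = np.zeros(H), np.zeros(H)
    for x in frame_feats:                      # one feature vector per video frame
        h, c = lstm_step(x, h, c, W, U, b)
    logits = h @ W_out + b_out
    e = np.exp(logits - logits.max())          # numerically stable softmax
    return e / e.sum()

# Illustrative dimensions (not from the paper): 64-d features, 32 hidden units,
# 5 word/expression classes, 10 frames per clip.
rng = np.random.default_rng(0)
D, H, K, T = 64, 32, 5, 10
W = rng.normal(0, 0.1, (D, 4 * H))
U = rng.normal(0, 0.1, (H, 4 * H))
b = np.zeros(4 * H)
W_out, b_out = rng.normal(0, 0.1, (H, K)), np.zeros(K)
probs = classify_sequence(rng.normal(size=(T, D)), W, U, b, W_out, b_out)
```

In practice the CNN and LSTM would be trained jointly in a deep learning framework; this sketch only shows how the temporal model consumes the spatial features.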
Pages: 5455-5472
Page count: 18