Lip Reading using Simple Dynamic Features and a Novel ROI for Feature Extraction

Cited by: 2

Authors:

Jain, Abhilash [1]

Rathna, G. N. [1]

Affiliation:

[1] Indian Inst Sci, Bangalore, Karnataka, India

Source:

2018 INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND MACHINE LEARNING (SPML 2018) | 2018

Keywords:

Automatic lip reading; visual speech recognition; feature extraction;

DOI:

10.1145/3297067.3297083

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Deaf or hard-of-hearing people mostly rely on lip-reading to understand speech. They demonstrate the ability of humans to understand speech from visual cues only. Automatic lip reading systems work in a similar fashion, obtaining speech or text from visual information alone, such as a video of a person's face. In this paper, an automatic lip reading system for spoken digit recognition is presented. The system uses simple dynamic features, created by taking difference images between consecutive frames of the video input. Using this technique, word recognition rates of 83.79% and 65.58% are achieved in speaker-dependent and speaker-independent testing scenarios, respectively. A novel, extended region-of-interest (ROI) that includes the lower jaw and neck region is also introduced, whereas most lip-reading algorithms use only the mouth/lip region for feature extraction. Compared with using only the mouth as the ROI, the proposed ROI improves performance by 4% in speaker-dependent tests and by 11% in speaker-independent tests.
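
The abstract describes dynamic features built from difference images between consecutive video frames. A minimal sketch of that one step is given below, assuming grayscale frames stacked in a NumPy array of shape (T, H, W). The function name `difference_images`, the use of signed (rather than absolute) differences, and the toy clip are illustrative assumptions; the abstract does not specify the paper's actual preprocessing, ROI cropping, normalization, or classifier.

```python
import numpy as np


def difference_images(frames: np.ndarray) -> np.ndarray:
    """Return frame-to-frame differences for a grayscale clip.

    frames: array of shape (T, H, W), assumed already cropped to the ROI.
    Returns an array of shape (T - 1, H, W) holding frame[t+1] - frame[t].
    Signed differences are an assumption; the abstract does not say
    whether signed or absolute differences are used.
    """
    # Cast to float first so uint8 subtraction cannot wrap around.
    frames = np.asarray(frames, dtype=np.float32)
    if frames.ndim != 3 or frames.shape[0] < 2:
        raise ValueError("expected at least two grayscale frames shaped (T, H, W)")
    return frames[1:] - frames[:-1]


# Hypothetical toy clip: three 2x2 frames with one changing pixel.
clip = np.zeros((3, 2, 2), dtype=np.uint8)
clip[1, 0, 0] = 10
clip[2, 0, 0] = 4
diffs = difference_images(clip)
```

A clip of T frames yields T - 1 difference images, so static regions contribute zeros and only pixels that change between frames carry signal.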

Citation

Pages: 73-77

Number of pages: 5

