Lip Tracking Method for the System of Audio-Visual Polish Speech Recognition

Cited by: 0

Authors:

Kubanek, Mariusz [1]

Bobulski, Janusz [1]

Adrjanowicz, Lukasz [1]

Affiliations:

[1] Czestochowa Tech Univ, Inst Comp & Informat Sci, PL-42200 Czestochowa, Poland

Source:

ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, PT I | 2012 / Vol. 7267

Keywords:

lip reading; visual speech; audio visual speech recognition;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper proposes a method of tracking the lips in an audio-visual speech recognition system. The presented method consists of a face detector, face tracker, lip detector, lip tracker, and word classifier. In speech recognition systems, the audio signal is exposed to a large amount of acoustic noise; therefore, scientists are looking for ways to reduce the effect of audio interference on recognition results. Visual speech is one source of information that is not perturbed by the acoustic environment and noise. To analyze visual speech, one has to develop a method of lip tracking. This work presents a method for automatic detection of the outer edges of the lips, which was used to identify individual words in audio-visual speech recognition. Additionally, the paper shows how to use visual speech to divide the audio signal into phonemes.

Pages: 535-542

Number of pages: 8

50 items in total

[1] Realtime lip contour tracking for audio-visual speech recognition applications
Yazdi, Mehran
Seyfi, Mehdi
Rafati, Amirhossein
Asadi, Meghdad
World Academy of Science, Engineering and Technology, 2009, 40 : 164 - 167
[2] Lip movement synthesis in audio-visual speech recognition system
Li, JQ
Yin, YX
PROCEEDINGS OF THE 2005 IEEE INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING (IEEE NLP-KE'05), 2005: 461 - 465
[3] Lip movement synthesis in audio-visual speech recognition system
Li, Junquan
Yin, Yixin
Proc. 2005 IEEE Int. Conf. on Lang. Process. Knowl. Engin. IEEE NLP-KE '05, (461-465):
[4] Analysis of lip geometric features for audio-visual speech recognition
Kaynak, MN
Zhi, Q
Cheok, AD
Sengupta, K
Han, Z
Chung, KC
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART A-SYSTEMS AND HUMANS, 2004, 34 (04): 564 - 570
[5] An audio-visual speech recognition system for testing new audio-visual databases
Pao, Tsang-Long
Liao, Wen-Yuan
VISAPP 2006: PROCEEDINGS OF THE FIRST INTERNATIONAL CONFERENCE ON COMPUTER VISION THEORY AND APPLICATIONS, VOL 2, 2006, : 192 - +
[6] RBF neural network mouth tracking for audio-visual speech recognition system
Hui, LE
Seng, KP
Tse, KM
TENCON 2004 - 2004 IEEE REGION 10 CONFERENCE, VOLS A-D, PROCEEDINGS: ANALOG AND DIGITAL TECHNIQUES IN ELECTRICAL ENGINEERING, 2004: A84 - A87
[7] Method of speech recognition and speaker identification using audio-visual of polish speech and hidden Markov models
Kubanek, Mariusz
BIOMETRICS, COMPUTER SECURITY SYSTEMS AND ARTIFICIAL INTELLIGENCE APPLICATIONS, 2006: 45 - 55
[8] Connectionism based audio-visual speech recognition method
Che, Na
Zhu, Yi-Ming
Zhao, Jian
Sun, Lei
Shi, Li-Juan
Zeng, Xian-Wei
Jilin Daxue Xuebao (Gongxueban)/Journal of Jilin University (Engineering and Technology Edition), 2024, 54 (10): 2984 - 2993
[9] Audio-visual speech recognition based on joint training with audio-visual speech enhancement for robust speech recognition
Hwang, Jung-Wook
Park, Jeongkyun
Park, Rae-Hong
Park, Hyung-Min
APPLIED ACOUSTICS, 2023, 211
[10] Lips Detection for Audio-Visual Speech Recognition System
Chin, Siew Wen
Ang, Li-Minn
Seng, Kah Phooi
2008 INTERNATIONAL SYMPOSIUM ON INTELLIGENT SIGNAL PROCESSING AND COMMUNICATIONS SYSTEMS (ISPACS 2008), 2008: 311 - 314
