Lip Tracking Method for the System of Audio-Visual Polish Speech Recognition

Times cited: 0
Authors
Kubanek, Mariusz [1]
Bobulski, Janusz [1]
Adrjanowicz, Lukasz [1]
Affiliations
[1] Czestochowa Tech Univ, Inst Comp & Informat Sci, PL-42200 Czestochowa, Poland
Keywords
lip reading; visual speech; audio-visual speech recognition
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a lip-tracking method for an audio-visual speech recognition system. The presented method consists of a face detector, face tracker, lip detector, lip tracker, and word classifier. In speech recognition systems, the audio signal is exposed to a large amount of acoustic noise, so researchers are looking for ways to reduce the effect of audio interference on recognition results. Visual speech is one source of information that is not perturbed by the acoustic environment and noise. Analyzing visual speech requires a method of lip tracking. This work presents a method for automatic detection of the outer edges of the lips, which was used to identify individual words in audio-visual speech recognition. The paper also shows how to use visual speech to divide the audio signal into phonemes.
Pages: 535-542
Number of pages: 8