Normalized training for HMM-based visual speech recognition

Cited by: 1
Authors
Nankaku, Y [1]
Tokuda, K [1]
Kitamura, T [1]
Kobayashi, T [1]
Affiliations
[1] Nagoya Inst Technol, Dept Comp Sci, Nagoya, Aichi 4668555, Japan
Source
2000 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOL III, PROCEEDINGS | 2000
DOI
10.1109/ICIP.2000.899338
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
This paper presents an approach to estimating the parameters of continuous-density HMMs for visual speech recognition. A key issue in image-based visual speech recognition is normalizing lip location and lighting conditions before estimating the HMM parameters. We previously presented a normalized training method in which the normalization process is integrated into model training. This paper extends that method to contrast normalization, in addition to average-intensity and location normalization. The proposed method provides a theoretically well-defined algorithm based on a maximum likelihood formulation, so the likelihood of the training data is guaranteed to increase at each iteration of the normalized training. Experiments on the M2VTS database show that normalized training significantly improves recognition performance.
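The idea of folding normalization into maximum likelihood training can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes a single diagonal-covariance Gaussian in place of HMM output distributions, handles only an additive average-intensity offset per frame (no location or contrast normalization), and every function name below is hypothetical. It alternates between choosing each frame's offset to maximize the likelihood under the current model and re-estimating the model from the normalized frames, so the training-data log-likelihood should not decrease from one iteration to the next.

```python
import math
import random


def log_likelihood(frames, offsets, mean, var):
    """Total log-likelihood of offset-normalized frames under a diagonal Gaussian."""
    total = 0.0
    for x, b in zip(frames, offsets):
        for d, (m, v) in enumerate(zip(mean, var)):
            diff = x[d] - b - m
            total -= 0.5 * (math.log(2.0 * math.pi * v) + diff * diff / v)
    return total


def update_offsets(frames, mean, var):
    """ML intensity offset for each frame, with the model held fixed.

    For a diagonal Gaussian this is the inverse-variance-weighted mean
    of the residual x - mean.
    """
    inv = [1.0 / v for v in var]
    norm = sum(inv)
    return [
        sum((xd - m) * w for xd, m, w in zip(x, mean, inv)) / norm
        for x in frames
    ]


def update_model(frames, offsets, var_floor=1e-6):
    """ML mean and variance of the normalized frames, with offsets held fixed.

    The variance floor is an assumption added here to guard against
    degenerate (near-zero) variances; it is not taken from the paper.
    """
    n, dim = len(frames), len(frames[0])
    mean = [sum(x[d] - b for x, b in zip(frames, offsets)) / n for d in range(dim)]
    var = [
        max(sum((x[d] - b - mean[d]) ** 2 for x, b in zip(frames, offsets)) / n,
            var_floor)
        for d in range(dim)
    ]
    return mean, var


def normalized_training(frames, iterations=5):
    """Alternate offset and model updates; return the log-likelihood history."""
    offsets = [0.0] * len(frames)
    mean, var = update_model(frames, offsets)
    history = [log_likelihood(frames, offsets, mean, var)]
    for _ in range(iterations):
        offsets = update_offsets(frames, mean, var)
        mean, var = update_model(frames, offsets)
        history.append(log_likelihood(frames, offsets, mean, var))
    return history, offsets


def make_toy_frames(num_frames=40, dim=16, seed=0):
    """Synthetic 'images': one template plus a random brightness shift and noise."""
    rng = random.Random(seed)
    template = [rng.uniform(0.0, 1.0) for _ in range(dim)]
    frames = []
    for _ in range(num_frames):
        brightness = rng.uniform(-0.5, 0.5)
        frames.append([t + brightness + rng.gauss(0.0, 0.05) for t in template])
    return frames


if __name__ == "__main__":
    history, _ = normalized_training(make_toy_frames())
    for step, value in enumerate(history):
        print(step, round(value, 3))
```

Each of the two update steps maximizes the same objective over one group of parameters while the other is fixed, which is why the sketch behaves monotonically. A purely additive offset is used because it leaves the likelihood well defined without a Jacobian term; a gain (contrast) parameter would need more care and is left out here.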
Pages: 234-237
Page count: 4