Accent Issues in Large Vocabulary Continuous Speech Recognition

Cited by: 46
Authors
Chao Huang
Tao Chen
Eric Chang
Affiliations
[1] Microsoft Research Asia,
Keywords
automatic speech recognition; speaker variability; pronunciation modeling; accent adaptation; accent identification
DOI
10.1023/B:IJST.0000017014.52972.1d
Abstract
This paper addresses accent issues in large vocabulary continuous speech recognition. Cross-accent experiments show that accent is a dominant problem in speech recognition. Analysis based on multivariate statistical tools (principal component analysis and independent component analysis) confirms that accent is one of the key factors in speaker variability. Considering different applications, we propose two methods for accent adaptation. When a certain amount of adaptation data is available, pronunciation dictionary modeling is adopted to reduce recognition errors caused by pronunciation mistakes. When a large corpus has been collected for each accent type, accent-dependent models are trained and a Gaussian mixture model-based accent identification system is developed for model selection. We report experimental results for the two schemes and verify their effectiveness in each situation.
Pages: 141-153
Number of pages: 12