Conversion from Phoneme Based to Grapheme Based Acoustic Models for Speech Recognition

Cited by: 0
Authors
Zgank, Andrej [1]
Kacic, Zdravko [1]
Affiliations
[1] Univ Maribor, Digital Signal Proc Lab, SI-2000 Maribor, Slovenia
Source
INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5 | 2006
Keywords
acoustic modeling; grapheme based; bootstrapping; confusion matrix; speech recognition
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper focuses on acoustic modeling in speech recognition. A novel approach to building grapheme-based acoustic models by conversion from existing phoneme-based acoustic models is proposed. The grapheme-based acoustic models are created as a weighted sum of monophone acoustic models. The influence of each monophone is determined by the phoneme-to-grapheme confusion matrix. The context-dependent acoustic models are then trained within the grapheme training procedure. Decision-tree-based clustering is used to tie similar states. A modified data-driven method is applied to generate the grapheme broad classes needed to initialize the decision tree. These data-driven broad classes are created using the grapheme-based confusion matrix. All experiments were performed on Slovenian (1000 FDB SpeechDat(II) database), a highly inflectional language with no fixed set of rules for grapheme-to-phoneme conversion. The results showed that the proposed methods improve speech recognition performance.
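The weighted-sum idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which model parameters are combined or how the confusion counts become weights, so everything below is an assumption made for illustration: each monophone is reduced to a single mean vector, the confusion matrix is stored as grapheme rows of phoneme counts, and the weights are the row-normalized counts. The function names and the toy data are hypothetical and not taken from the paper.

```python
def confusion_to_weights(confusion, grapheme):
    """Turn one grapheme's row of phoneme counts into weights that sum to 1.

    Assumption: weights are plain row-normalized counts; the paper may
    use a different normalization.
    """
    counts = confusion[grapheme]
    total = sum(counts.values())
    if total == 0:
        raise ValueError(f"no confusion counts for grapheme {grapheme!r}")
    return {phoneme: count / total for phoneme, count in counts.items() if count > 0}


def grapheme_mean(weights, monophone_means):
    """Weighted sum of monophone mean vectors for one grapheme model.

    Assumption: only mean vectors are combined; a real HMM also has
    variances, mixture weights, and transition probabilities.
    """
    dim = len(next(iter(monophone_means.values())))
    mean = [0.0] * dim
    for phoneme, weight in weights.items():
        for i, value in enumerate(monophone_means[phoneme]):
            mean[i] += weight * value
    return mean


# Toy data (hypothetical): grapheme "c" aligned with phoneme "ts" 8 times
# and with phoneme "k" 2 times.
confusion = {"c": {"ts": 8, "k": 2}}
monophone_means = {"ts": [1.0, 0.0], "k": [0.0, 1.0]}

weights = confusion_to_weights(confusion, "c")
initial_mean = grapheme_mean(weights, monophone_means)
```

In this toy case the grapheme model starts closest to the monophone it is most often confused with, which is the point of bootstrapping from existing phoneme models before the grapheme training procedure refines the parameters.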
Pages: 1587-1590
Number of pages: 4