MDL-based context-dependent subword modeling for speech recognition

Authors
Shinoda, Koichi [1]
Watanabe, Takao [1]
Affiliation
[1] NEC Corp, Kawasaki, Japan
Keywords
Markov processes; Mathematical models; Maximum likelihood estimation; Pattern recognition systems; Speech analysis
Abstract
Context-dependent phone units, such as triphones, have recently come to be used to model subword units in speech recognition systems that are based on the use of hidden Markov models (HMMs). While most such systems employ clustering of the HMM parameters (e.g., subword clustering and state clustering) to control the HMM size, so as to avoid poor recognition accuracy due to a lack of training data, none of them provide any effective criteria for determining the optimal number of clusters. This paper proposes a method in which state clustering is accomplished by way of phonetic decision trees and in which the minimum description length (MDL) criterion is used to optimize the number of clusters. Large-vocabulary Japanese-language recognition experiments show that this method achieves higher accuracy than the maximum-likelihood approach.
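The idea of using MDL to decide when to stop splitting a decision-tree node can be illustrated with a minimal sketch. This is not the authors' formulation: it assumes one-dimensional observations, a single Gaussian per cluster, and the generic two-part description length (negative log-likelihood plus (k/2) log N for k free parameters). The function names `gaussian_neg_log_lik`, `description_length`, and `should_split` are hypothetical and chosen only for this example.

```python
import math


def gaussian_neg_log_lik(samples):
    """Negative log-likelihood of 1-D samples under their ML Gaussian fit."""
    n = len(samples)
    mean = sum(samples) / n
    # Floor the variance so a degenerate cluster does not produce log(0).
    var = max(sum((x - mean) ** 2 for x in samples) / n, 1e-8)
    return 0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


def description_length(clusters, total_n):
    """Two-part MDL: data cost plus (k/2) * log(N) for k free parameters."""
    data_cost = sum(gaussian_neg_log_lik(c) for c in clusters)
    n_params = 2 * len(clusters)  # assumed: one mean and one variance per cluster
    return data_cost + 0.5 * n_params * math.log(total_n)


def should_split(parent, yes, no, total_n):
    """Accept a candidate split only if it shortens the description length."""
    return description_length([yes, no], total_n) < description_length([parent], total_n)


# Illustrative data only: two well-separated groups versus one homogeneous group.
low = [0.0, 0.1, -0.1, 0.05, -0.05] * 4
high = [x + 10.0 for x in low]
separated = should_split(low + high, low, high, total_n=40)
homogeneous = should_split(low, low[:10], low[10:], total_n=20)
```

Under a pure maximum-likelihood criterion a split never increases the data cost, so some external threshold is needed to stop the tree from growing; in this sketch the parameter penalty plays that role, so the second split above is rejected even though it fits the data equally well.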
Pages: 79-86