MDL-based context-dependent subword modeling for speech recognition

Cited by: 0

Authors:

Shinoda, Koichi [1]

Watanabe, Takao [1]

Affiliation:

[1] NEC Corp, Kawasaki, Japan

Source:

Journal of the Acoustical Society of Japan (E) (English translation of Nippon Onkyo Gakkaishi) | 2000, Vol. 21, No. 2

Keywords:

Markov processes - Mathematical models - Maximum likelihood estimation - Pattern recognition systems - Speech analysis

DOI:

Not available

Abstract:

Context-dependent phone units, such as triphones, have recently come to be used to model subword units in speech recognition systems that are based on the use of hidden Markov models (HMMs). While most such systems employ clustering of the HMM parameters (e.g., subword clustering and state clustering) to control the HMM size, so as to avoid poor recognition accuracy due to a lack of training data, none of them provide any effective criteria for determining the optimal number of clusters. This paper proposes a method in which state clustering is accomplished by way of phonetic decision trees and in which the minimum description length (MDL) criterion is used to optimize the number of clusters. Large-vocabulary Japanese-language recognition experiments show that this method achieves higher accuracy than the maximum-likelihood approach.
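
The abstract gives no formula, so the following is only a minimal sketch of how an MDL-style test for one decision-tree split could look. It is not taken from the paper. It assumes each clustered state is modeled by a single diagonal-covariance Gaussian and uses the generic two-part description length (negative log-likelihood plus half the parameter count times the log of the data size). The names `gaussian_loglik` and `mdl_split_delta` and all statistics are hypothetical.

```python
import math


def gaussian_loglik(n_frames, variances):
    """Maximized log-likelihood of n_frames under an ML-fitted diagonal Gaussian.

    With ML mean and variances, the quadratic term sums to one per frame and
    dimension, so only the frame count and the variances are needed.
    """
    k = len(variances)
    log_det = sum(math.log(v) for v in variances)
    return -0.5 * n_frames * (k * (1.0 + math.log(2.0 * math.pi)) + log_det)


def mdl_split_delta(parent, yes, no, total_frames):
    """Change in description length if `parent` is split into `yes` and `no`.

    Each node is a (frame_count, per_dimension_variances) pair (assumed layout).
    A negative value means the split shortens the description, so it is accepted.
    """
    k = len(parent[1])
    loglik_gain = (gaussian_loglik(*yes) + gaussian_loglik(*no)
                   - gaussian_loglik(*parent))
    # One extra Gaussian adds 2k parameters (k means, k variances);
    # each costs 0.5 * log(total_frames) in the two-part code.
    penalty = k * math.log(total_frames)
    return penalty - loglik_gain


# Made-up statistics: a split that sharply reduces variance is accepted ...
good = mdl_split_delta((1000, [4.0]), (500, [1.0]), (500, [1.0]), 1000)
# ... while one that barely changes it is rejected, which stops tree growth.
weak = mdl_split_delta((1000, [4.0]), (500, [3.99]), (500, [3.99]), 1000)
print("accept good split:", good < 0)
print("accept weak split:", weak < 0)
```

Under these assumptions the penalty term plays the role of a stopping threshold that depends only on the model dimension and the amount of training data, whereas a pure maximum-likelihood criterion would keep splitting because the likelihood gain is never negative.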

Pages: 79-86
