Prosodic Modeling in Large Vocabulary Mandarin Speech Recognition

Cited by: 0

Authors:

Huang, Jui-Ting [1]

Lee, Lin-shan [1]

Affiliations:

[1] Natl Taiwan Univ, Grad Inst Commun Engn, Taipei, Taiwan

Source:

INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5 | 2006

Keywords:

prosody; speech recognition; Mandarin tone; lexical word boundaries

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Incorporating prosodic information into speech recognition has drawn increasing attention in recent years. In this work we present a complete framework for Mandarin speech recognition with prosodic modeling that considers two-level hierarchical prosodic information for Mandarin Chinese. We developed a GMM-based approach, a decision-tree-based approach, and a hybrid approach. The best improvements in character recognition accuracy were obtained with the decision-tree-based prosodic models. This approach does not require a training corpus labeled with prosodic features, and it works reasonably well for a large-scale multi-speaker task.

Citation

Pages: 1241-1244

Page count: 4
