Ultra low bit rate speech coding using an ergodic hidden Markov model

Cited by: 0

Authors:

Lee, ME [1]

Durey, AS [1]

Moore, E [1]

Clements, M [1]

Affiliations:

[1] Georgia Inst Technol, Sch Elect & Comp Engn, Ctr Signal & Image Proc, Atlanta, GA 30332 USA

Source:

2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING | 2005

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper presents the framework for an ultra low bit rate speech vocoder. The system is based on a recognition-synthesis paradigm in which a single ergodic hidden Markov model (EHMM) is used to capture the statistical characterizations of speech in a flexible manner capable of limiting the effects of recognition errors. Because predetermined speech units are not used, this system has the advantage of not requiring a transcription for the training data set. By incorporating a mixed excitation scheme based on an improved MELP formulation into the EHMM, additional gains in quality and speaker characterization are achieved at no cost to the bit rate.
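The recognition-synthesis idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the encoder decodes each frame to an EHMM state with the standard Viterbi algorithm and transmits one fixed-length state index per frame. The function names, the 64-state model size, and the 50 frames-per-second rate are hypothetical values chosen for illustration only.

```python
import math

def viterbi(log_init, log_trans, log_emit):
    """Return the most likely state path for a fully connected (ergodic) HMM.

    log_init[s]     : log prior of starting in state s
    log_trans[r][s] : log probability of moving from state r to state s
    log_emit[t][s]  : log-likelihood of frame t under state s
    """
    n_states = len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of state s at frame t
    for t in range(1, len(log_emit)):
        prev = score
        pointers, score = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log_trans[r][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s] + log_emit[t][s])
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

def bits_per_second(n_states, frames_per_second):
    """Bit rate if every frame sends one fixed-length state index (assumption)."""
    return math.ceil(math.log2(n_states)) * frames_per_second

# Hypothetical configuration, not taken from the paper.
print(bits_per_second(n_states=64, frames_per_second=50))
```

Under this assumed scheme the decoder would regenerate speech parameters from the received state indices, so any excitation parameters stored per state inside the model would add nothing to the transmitted bit count.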

Pages: 765-768

Page count: 4

