Hybrid Scalar/Vector Quantization of Mel-Frequency Cepstral Coefficients for Low Bit-Rate Coding of Speech

Cited by: 4

Authors:

Boucheron, Laura E. [1]

De Leon, Phillip L. [1]

Sandoval, Steven [1]

Affiliations:

[1] New Mexico State Univ, Klipsch Sch Elect & Comp Engn, Las Cruces, NM 88003 USA

Source:

2011 DATA COMPRESSION CONFERENCE (DCC) | 2011

Keywords:

RECOGNITION;

DOI:

10.1109/DCC.2011.17

Chinese Library Classification (CLC) number:

TP301 [Theory, methods];

Discipline classification code:

081202;

Abstract:

In this paper, we propose a low bit-rate speech codec based on a hybrid scalar/vector quantization of the mel-frequency cepstral coefficients (MFCCs). We begin by showing that if a high-resolution mel-frequency cepstrum (MFC) is computed, good-quality speech reconstruction is possible from the MFCCs despite the lack of explicit phase information. By evaluating the contribution that individual MFCCs make toward speech quality and applying appropriate quantization, we show that the perceptual evaluation of speech quality (PESQ) score of the MFCC-based codec matches that of the state-of-the-art MELPe codec at 600 bps and exceeds that of the CELP codec at coding rates of 2000-4000 bps. The main advantage of the proposed codec is in distributed speech recognition (DSR): speech features based on MFCCs can be obtained directly from the codewords, eliminating the additional decoding and feature-extraction stages.

Pages: 103-112

Page count: 10
