Hybrid Scalar/Vector Quantization of Mel-Frequency Cepstral Coefficients for Low Bit-Rate Coding of Speech

Cited by: 4

Authors:

Boucheron, Laura E. [1]

De Leon, Phillip L. [1]

Sandoval, Steven [1]

Affiliations:

[1] New Mexico State Univ, Klipsch Sch Elect & Comp Engn, Las Cruces, NM 88003 USA

Source:

2011 DATA COMPRESSION CONFERENCE (DCC) | 2011

Keywords:

RECOGNITION;

DOI:

10.1109/DCC.2011.17

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Subject classification code:

081202 ;

Abstract:

In this paper, we propose a low bit-rate speech codec based on a hybrid scalar/vector quantization of the mel-frequency cepstral coefficients (MFCCs). We begin by showing that if a high-resolution mel-frequency cepstrum (MFC) is computed, good-quality speech reconstruction is possible from the MFCCs despite the lack of explicit phase information. By evaluating the contribution that individual MFCCs make toward speech quality and applying appropriate quantization, we show that the perceptual evaluation of speech quality (PESQ) score of the MFCC-based codec matches that of the state-of-the-art MELPe codec at 600 bps and exceeds that of the CELP codec at coding rates of 2000-4000 bps. The main advantage of the proposed codec is in distributed speech recognition (DSR): because speech features based on MFCCs can be obtained directly from the codewords, the additional decode and feature-extraction stages are eliminated.
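The abstract does not give the coefficient split, quantizer ranges, bit allocation, or codebook, so the following is only a minimal sketch of what a hybrid scalar/vector quantizer for one MFCC frame could look like. It assumes the first `n_scalar` coefficients are uniformly scalar-quantized and the remainder are vector-quantized by nearest-neighbour search. All function names, parameters, and the toy codebook are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: split point, ranges, bit counts and codebook
# are assumptions, not the values or the method used in the paper.

def scalar_quantize(x, lo, hi, bits):
    """Uniform scalar quantizer: index of the cell in [lo, hi] containing x."""
    levels = 1 << bits
    x = min(max(x, lo), hi)              # clip to the quantizer range
    step = (hi - lo) / levels
    return min(int((x - lo) / step), levels - 1)

def scalar_dequantize(idx, lo, hi, bits):
    """Reconstruct the midpoint of the quantization cell."""
    step = (hi - lo) / (1 << bits)
    return lo + (idx + 0.5) * step

def vector_quantize(v, codebook):
    """Index of the codeword closest to v in squared Euclidean distance."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(v, codebook[i])))

def encode_frame(mfcc, n_scalar, lo, hi, bits, codebook):
    """Scalar-quantize the first n_scalar coefficients, vector-quantize the rest."""
    scalar_idx = [scalar_quantize(c, lo, hi, bits) for c in mfcc[:n_scalar]]
    vector_idx = vector_quantize(mfcc[n_scalar:], codebook)
    return scalar_idx, vector_idx

def decode_frame(scalar_idx, vector_idx, lo, hi, bits, codebook):
    """Rebuild an approximate MFCC frame from the transmitted indices."""
    head = [scalar_dequantize(i, lo, hi, bits) for i in scalar_idx]
    return head + list(codebook[vector_idx])

def bits_per_frame(n_scalar, bits, codebook_size):
    """Bits per frame: scalar bits plus ceil(log2(codebook_size)) for the VQ index."""
    return n_scalar * bits + (codebook_size - 1).bit_length()

# Toy usage with a hypothetical 4-entry codebook for the last two coefficients.
toy_codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0], [1.0, -1.0]]
toy_frame = [0.3, -0.7, 0.9, 1.2]
s_idx, v_idx = encode_frame(toy_frame, 2, -1.0, 1.0, 4, toy_codebook)
reconstructed = decode_frame(s_idx, v_idx, -1.0, 1.0, 4, toy_codebook)
```

Under these assumptions the bit rate is `bits_per_frame(...)` multiplied by the frame rate, and the decoded frame is already an approximate MFCC feature vector, which is the property the abstract highlights for DSR.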

Pages: 103-112

Page count: 10
