Hybrid Scalar/Vector Quantization of Mel-Frequency Cepstral Coefficients for Low Bit-Rate Coding of Speech

Cited by: 4
Authors
Boucheron, Laura E. [1 ]
De Leon, Phillip L. [1 ]
Sandoval, Steven [1 ]
Affiliations
[1] New Mexico State Univ, Klipsch Sch Elect & Comp Engn, Las Cruces, NM 88003 USA
Keywords
RECOGNITION;
DOI
10.1109/DCC.2011.17
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
In this paper, we propose a low bit-rate speech codec based on a hybrid scalar/vector quantization of the mel-frequency cepstral coefficients (MFCCs). We begin by showing that if a high-resolution mel-frequency cepstrum (MFC) is computed, good-quality speech reconstruction is possible from the MFCCs despite the lack of explicit phase information. By evaluating the contribution that individual MFCCs make toward speech quality and applying appropriate quantization, we show that the perceptual evaluation of speech quality (PESQ) score of the MFCC-based codec matches that of the state-of-the-art MELPe codec at 600 bps and exceeds that of the CELP codec at coding rates of 2000-4000 bps. The main advantage of the proposed codec is in distributed speech recognition (DSR), since MFCC-based speech features can be obtained directly from the codewords, eliminating the additional decoding and feature-extraction stages.
Pages: 103-112
Page count: 10
Related papers
50 in total
  • [21] One Solution of Extension of Mel-Frequency Cepstral Coefficients Feature Vector for Automatic Speaker Recognition
    Jokic, Ivan D.
    Jokic, Stevan D.
    Delic, Vlado D.
    Peric, Zoran H.
    INFORMATION TECHNOLOGY AND CONTROL, 2020, 49 (02): : 224 - 236
  • [22] Analysis and prediction of acoustic speech features from mel-frequency cepstral coefficients in distributed speech recognition architectures
    Darch, Jonathan
    Milner, Ben
    Vaseghi, Saeed
    Journal of the Acoustical Society of America, 2009, 124 (06): : 3989 - 4000
  • [23] Analysis and prediction of acoustic speech features from mel-frequency cepstral coefficients in distributed speech recognition architectures
    Darch, Jonathan
    Milner, Ben
    Vaseghi, Saeed
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2008, 124 (06): : 3989 - 4000
  • [24] Multiple-description predictive-vector quantization with applications to low bit-rate speech coding over networks
    Yahampath, Pradeepa
    Rondeau, Paul
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (03): : 749 - 755
  • [25] SIGNAL MODELS FOR LOW BIT-RATE CODING OF SPEECH
    FLANAGAN, JL
    ISHIZAKA, K
    SHIPLEY, KL
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1980, 68 (03): : 780 - 791
  • [26] Techniques of very low bit-rate speech coding
    Cui, HJ
    Tang, K
    Zhao, M
    Zhang, X
    CHINESE JOURNAL OF ELECTRONICS, 2004, 13 (01): : 63 - 65
  • [27] Joint Quantization Strategies for Low Bit-Rate Sinusoidal Coding
    Unver, Emre
    Villette, Stephane
    Kondoz, Ahmet
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 2571 - 2574
  • [28] Speech Reconstruction from Mel-frequency Cepstral Coefficients via l1-norm Minimization
    Min, Gang
    Zhang, Xiongwei
    Yang, Jibin
    Zou, Xia
    2015 IEEE 17TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP), 2015,
  • [29] Algorithm for speech emotion recognition classification based on Mel-frequency Cepstral coefficients and broad learning system
    Zhiyou Yang
    Ying Huang
    Evolutionary Intelligence, 2022, 15 : 2485 - 2494
  • [30] Drive-by bridge damage detection using Mel-frequency cepstral coefficients and support vector machine
    Li, Zhenkun
    Lin, Weiwei
    Zhang, Youqi
    STRUCTURAL HEALTH MONITORING-AN INTERNATIONAL JOURNAL, 2023, 22 (05): : 3302 - 3319