Low bit-rate speech coding with predictive multi-level vector quantization

Cited by: 0
Authors
Yu, Xingye [1]
Li, Ye [1]
Zhang, Peng [1]
Lin, Lingxia [1]
Cai, Tianyu [1]
Affiliations
[1] Qilu Univ Technol, Shandong Acad Sci, Shandong Prov Key Lab Comp Networks, Natl Supercomp Ctr Jinan,Shandong Comp Sci Ctr, Jinan 250014, Peoples R China
Keywords
Speech coding; Predictive multi-level vector quantization; Full-band feature extractor
DOI
10.1016/j.apacoust.2025.110538
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Wideband speech coding provides high-fidelity speech transmission, but its high bandwidth requirements limit its use in resource-constrained environments, so narrowband speech coding still holds research value. Traditional narrowband low bit-rate speech coding methods, however, usually fail to deliver satisfactory speech quality. To address this issue, this paper proposes a narrowband low bit-rate speech coding architecture called PMVQCodec, with the following major improvements. First, we design a predictive multi-level vector quantization (PMVQ) technique, which employs a predictor to capture the correlations between latent frame vectors and combines it with multi-level vector quantization to enhance quantization efficiency. Second, we introduce a full-band feature extractor to reduce computational complexity. Both subjective and objective evaluations demonstrate the effectiveness of the proposed PMVQCodec architecture: it achieves higher-quality reconstructed speech than Encodec and HiFiCodec at 1.2 kbps and 2.4 kbps, and even outperforms LyraV2 at 6 kbps.
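The abstract gives no implementation details for PMVQ, so the following is only a minimal sketch of the general idea of predictive multi-stage vector quantization, not the authors' method. It assumes a fixed first-order predictor (the previous reconstructed frame scaled by a hypothetical coefficient `alpha`) and small hand-written codebooks; the function names `nearest`, `encode`, and `decode` are illustrative only. Each stage quantizes whatever residual the previous stage left behind, and the decoder rebuilds the same prediction from its own past output.

```python
def nearest(codebook, v):
    """Return the index of the codeword closest to v (squared Euclidean distance)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, v))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def encode(frames, codebooks, alpha=0.9):
    """Quantize each frame's prediction residual with a cascade of codebooks.

    alpha is an assumed first-order prediction coefficient, not a value
    taken from the paper.
    """
    prev = [0.0] * len(frames[0])  # last reconstructed frame
    all_indices = []
    for x in frames:
        pred = [alpha * p for p in prev]            # predict from the past
        resid = [a - b for a, b in zip(x, pred)]    # what the predictor missed
        recon = list(pred)
        indices = []
        for cb in codebooks:                        # one index per stage
            i = nearest(cb, resid)
            indices.append(i)
            resid = [r - c for r, c in zip(resid, cb[i])]
            recon = [q + c for q, c in zip(recon, cb[i])]
        all_indices.append(indices)
        prev = recon  # predict from the reconstruction so the decoder stays in sync
    return all_indices


def decode(all_indices, codebooks, dim, alpha=0.9):
    """Rebuild frames from stage indices using the same predictor as the encoder."""
    prev = [0.0] * dim
    out = []
    for indices in all_indices:
        recon = [alpha * p for p in prev]
        for cb, i in zip(codebooks, indices):
            recon = [q + c for q, c in zip(recon, cb[i])]
        out.append(recon)
        prev = recon
    return out


# Toy two-stage example with hand-written (hypothetical) codebooks.
demo_codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],    # coarse stage
    [[0.0, 0.0], [0.5, -0.5]],   # refinement stage
]
demo_indices = encode([[1.5, 0.5]], demo_codebooks)
print(demo_indices, decode(demo_indices, demo_codebooks, dim=2))
```

In this sketch the bit cost per frame is the sum of log2 of each codebook size, which is why splitting the quantizer into stages keeps the codebooks small. A learned predictor and trained codebooks, as in a neural codec, would replace the fixed `alpha` and the toy codebooks.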
Pages: 10
Related papers
50 in total
  • [1] Pitch quantization in low bit-rate speech coding
    Eriksson, T
    Kang, HG
    ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999, : 489 - 492
  • [2] Multiple-description predictive-vector quantization with applications to low bit-rate speech coding over networks
    Yahampath, Pradeepa
    Rondeau, Paul
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (03): : 749 - 755
  • [3] LOW BIT-RATE VIDEO CODING USING WAVELET VECTOR QUANTIZATION
    SAMPSON, DG
    DASILVA, EAB
    GHANBARI, M
    IEE PROCEEDINGS-VISION IMAGE AND SIGNAL PROCESSING, 1995, 142 (03): : 141 - 148
  • [4] Hybrid Scalar/Vector Quantization of Mel-Frequency Cepstral Coefficients for Low Bit-Rate Coding of Speech
    Boucheron, Laura E.
    De Leon, Phillip L.
    Sandoval, Steven
    2011 DATA COMPRESSION CONFERENCE (DCC), 2011, : 103 - 112
  • [5] Low bit-rate vector quantization with progressive transmission
    Poggi, G
    EUROPEAN TRANSACTIONS ON TELECOMMUNICATIONS, 1997, 8 (02): : 171 - 177
  • [6] Steganography in vector quantization process of linear predictive coding for low-bit-rate speech codec
    Peng Liu
    Songbin Li
    Haiqiang Wang
    Multimedia Systems, 2017, 23 : 485 - 497
  • [7] Steganography integrated into linear predictive coding for low bit-rate speech codec
    Peng Liu
    Songbin Li
    Haiqiang Wang
    Multimedia Tools and Applications, 2017, 76 : 2837 - 2859
  • [8] Steganography integrated into linear predictive coding for low bit-rate speech codec
    Liu, Peng
    Li, Songbin
    Wang, Haiqiang
    MULTIMEDIA TOOLS AND APPLICATIONS, 2017, 76 (02) : 2837 - 2859
  • [9] Steganography in vector quantization process of linear predictive coding for low-bit-rate speech codec
    Liu, Peng
    Li, Songbin
    Wang, Haiqiang
    MULTIMEDIA SYSTEMS, 2017, 23 (04) : 485 - 497
  • [10] SIGNAL MODELS FOR LOW BIT-RATE CODING OF SPEECH
    FLANAGAN, JL
    ISHIZAKA, K
    SHIPLEY, KL
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1980, 68 (03): : 780 - 791