A very low bit rate speech coder using HMM-based speech recognition synthesis techniques

被引:0
|
作者
Tokuda, K [1 ]
Masuko, T [1 ]
Hiroi, J [1 ]
Kobayashi, T [1 ]
Kitamura, T [1 ]
机构
[1] Nagoya Inst Technol, Dept Comp Sci, Nagoya, Aichi 466, Japan
来源
PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6 | 1998年
关键词
D O I
暂无
中图分类号
O42 [声学];
学科分类号
070206 ; 082403 ;
摘要
This paper presents a very low bit rate speech coder based on HMM (Hidden Markov Model). The encoder carries out phoneme recognition, and transmits phoneme indexes, state durations and pitch information to the decoder. In the decoder, phoneme HMMs are concatenated according to the phoneme indexes, and a sequence of mel-cepstral coefficient vectors is generated from the concatenated HMM by using an ML-based speech parameter generation technique. Finally we obtain synthetic speech by exciting the MLSA (Mel Log Spectrum Approximation) filter, whose coefficients are given by mel-cepstral coefficients, according to the pitch information. A subjective listening test shows that the performance of the proposed coder at about 150 bit/s (for the test data including 26% silence region) is comparable to a VQ-based vocoder at 400 bit/s (= 8 bit/frame x 50 frame/s) without pitch quantization for both coders.
引用
收藏
页码:609 / 612
页数:4
相关论文
共 50 条
  • [11] Lost Speech Reconstruction Method using Speech Recognition based on Missing Feature Theory and HMM-based Speech Synthesis
    Kuroiwa, Shingo
    Tsuge, Satoru
    Ren, Fuji
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1105 - 1108
  • [12] Croatian HMM-based speech synthesis
    Department of Informatics, Faculty of Philosophy, University of Rijeka, Omladinska 14, Rijeka
    51000, Croatia
    J. Compt. Inf. Technol., 2006, 4 (307-313):
  • [13] HMM-Based Vietnamese Speech Synthesis
    Trinh Quoc Son
    2015 IEEE/ACIS 14TH INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION SCIENCE (ICIS), 2015, : 349 - 353
  • [14] Czech HMM-Based Speech Synthesis
    Hanzlicek, Zdenek
    TEXT, SPEECH AND DIALOGUE, 2010, 6231 : 291 - 298
  • [15] Robustness of HMM-based Speech Synthesis
    Yamagishi, Junichi
    Ling, Zhenhua
    King, Simon
    INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 581 - 584
  • [16] Very low bit rate speech coding using a diphone-based recognition and synthesis approach
    Felici, M
    Borgatti, M
    Guerrieri, R
    ELECTRONICS LETTERS, 1998, 34 (09) : 859 - 860
  • [17] Arabic HMM-based Speech Synthesis
    Khalil, Krichi Mohamed
    Adnan, Cherif
    2013 INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING AND SOFTWARE APPLICATIONS (ICEESA), 2013, : 450 - 454
  • [18] Speaker Adaptation using Nonlinear Regression Techniques for HMM-based Speech Synthesis
    Hong, Doo Hwa
    Kang, Shin Jae
    Lee, Joun Yeop
    Kim, Nam Soo
    2014 TENTH INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION HIDING AND MULTIMEDIA SIGNAL PROCESSING (IIH-MSP 2014), 2014, : 586 - 589
  • [19] Peripheral features for HMM-based speech recognition
    Fukuda, T
    Takigawa, M
    Nitta, T
    2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING - VOL IV: SIGNAL PROCESSING FOR COMMUNICATIONS; VOL V: SIGNAL PROCESSING EDUCATION SENSOR ARRAY & MULTICHANNEL SIGNAL PROCESSING AUDIO & ELECTROACOUSTICS; VOL VI: SIGNAL PROCESSING THEORY & METHODS STUDENT FORUM, 2001, : 129 - 132
  • [20] HMM-Based Vietnamese Speech Synthesis
    Trinh, Son
    Hoang, Kiem
    INTERNATIONAL JOURNAL OF SOFTWARE INNOVATION, 2015, 3 (04) : 33 - 47