A Covariance-Tying Technique for HMM-Based Speech Synthesis

Cited by: 10
Authors
Oura, Keiichiro [1]
Zen, Heiga [1]
Nankaku, Yoshihiko [1]
Lee, Akinobu [1]
Tokuda, Keiichi [1]
Affiliations
[1] Nagoya Inst Technol, Dept Comp Sci & Engn, Nagoya, Aichi 4668555, Japan
Source
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS | 2010 / Vol. E93D / No. 03
Keywords
HMM; speech synthesis; decision tree; context-clustering; MDL criterion; embedded device
DOI
10.1587/transinf.E93.D.595
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
A technique for reducing the footprints of HMM-based speech synthesis systems by tying all covariance matrices of state distributions is described. HMM-based speech synthesis systems usually leave smaller footprints than unit-selection synthesis systems because they store statistics rather than speech waveforms. However, further reduction is essential to put them on embedded devices, which have limited memory. In accordance with the empirical knowledge that covariance matrices have a smaller impact on the quality of synthesized speech than mean vectors, we propose a technique for clustering mean vectors while tying all covariance matrices. Subjective listening test results showed that the proposed technique can shrink the footprints of an HMM-based speech synthesis system while retaining the quality of the synthesized speech.
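The footprint argument in the abstract can be illustrated with a rough sketch. It assumes diagonal covariances and pools per-state variances into one shared vector using an occupancy-weighted average; the function names, the state count, and the feature dimension are hypothetical choices for illustration and are not taken from the paper, and the paper's mean-vector clustering step is not reproduced here.

```python
import numpy as np

def tie_covariances(variances, occupancies):
    """Pool per-state diagonal variances into one shared variance vector.

    variances:   array of shape (num_states, dim), per-state diagonal variances
    occupancies: array of shape (num_states,), state occupancy counts
    Assumption: a plain occupancy-weighted average stands in for the tied estimate.
    """
    variances = np.asarray(variances, dtype=float)
    occupancies = np.asarray(occupancies, dtype=float)
    weights = occupancies / occupancies.sum()
    # Broadcast the per-state weights across the feature dimension and sum over states.
    return (weights[:, None] * variances).sum(axis=0)

def parameter_count(num_states, dim, tied):
    """Count stored Gaussian parameters (means plus diagonal variances)."""
    mean_params = num_states * dim
    # With tying, a single variance vector is shared by every state.
    var_params = dim if tied else num_states * dim
    return mean_params + var_params

if __name__ == "__main__":
    # Hypothetical model size: 1000 clustered states, 40-dimensional features.
    untied = parameter_count(1000, 40, tied=False)
    tied = parameter_count(1000, 40, tied=True)
    print(untied, tied)
```

Under these assumed sizes, tying roughly halves the number of stored parameters, since the per-state variance vectors collapse into one shared vector while the mean vectors are kept per state.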
Pages: 595-601
Number of pages: 7