A Covariance-Tying Technique for HMM-Based Speech Synthesis

被引:10
作者
Oura, Keiichiro [1 ]
Zen, Heiga [1 ]
Nankaku, Yoshihiko [1 ]
Lee, Akinobu [1 ]
Tokuda, Keiichi [1 ]
机构
[1] Nagoya Inst Technol, Dept Comp Sci & Engn, Nagoya, Aichi 4668555, Japan
来源
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS | 2010年 / E93D卷 / 03期
关键词
HMM; speech synthesis; decision tree; context-clustering; MDL criterion; embedded device;
D O I
10.1587/transinf.E93.D.595
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
A technique for reducing the footprints of HMM-based speech synthesis systems by tying all covariance matrices of state distributions is described. HMM-based speech synthesis systems usually leave smaller footprints than unit-selection synthesis systems because they store statistics rather than speech waveforms. However, further reduction is essential to put them on embedded devices, which have limited memory. In accordance with the empirical knowledge that covariance matrices have a smaller impact on the quality of synthesized speech than mean vectors, we propose a technique for clustering mean vectors while tying all covariance matrices. Subjective listening test results showed that the proposed technique can shrink the footprints of an HMM-based speech synthesis system while retaining the quality of the synthesized speech.
引用
收藏
页码:595 / 601
页数:7
相关论文
共 50 条
  • [41] Roles of the Average Voice in Speaker-adaptive HMM-based Speech Synthesis
    Yamagishi, Junichi
    Watts, Oliver
    King, Simon
    Usabaev, Bela
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 418 - +
  • [42] Evaluation of Finnish Unit Selection and HMM-based Speech Synthesis
    Silen, Hanna
    Helander, Elina
    Nurminen, Jani
    Gabbouji, Moncef
    INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 1853 - +
  • [43] SIMPLE METHODS FOR IMPROVING SPEAKER-SIMILARITY OF HMM-BASED SPEECH SYNTHESIS
    Yamagishi, Junichi
    King, Simon
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4610 - 4613
  • [44] Creation of HMM-based Speech Model for Estonian Text-to-Speech Synthesis
    Nurk, Tonis
    HUMAN LANGUAGE TECHNOLOGIES: THE BALTIC PERSPECTIVE, 2012, 247 : 162 - 168
  • [45] Feature-Space Transform Tying in Unified Acoustic-Articulatory Modelling for Articulatory Control of HMM-based Speech Synthesis
    Ling, Zhen-Hua
    Richmond, Korin
    Yamagishi, Junichi
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 124 - +
  • [46] Amplitude Spectrum based Excitation Model for HMM-based Speech Synthesis
    Wen, Zhengqi
    Tao, Jianhua
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1426 - 1429
  • [47] An HMM-based Speech-smile Synthesis System: An Approach for Amusement Synthesis
    El Haddad, Kevin
    Dupont, Stephane
    d'Alessandro, Nicolas
    Dutoit, Thierry
    2015 11TH IEEE INTERNATIONAL CONFERENCE AND WORKSHOPS ON AUTOMATIC FACE AND GESTURE RECOGNITION (FG): EMOTION REPRESENTATION, ANALYSIS AND SYNTHESIS IN CONTINUOUS TIME AND SPACE (EMOSPACE 2015), VOL 5, 2015,
  • [48] HMM-based speech synthesis with various degrees of articulation: A perceptual study
    Picart, Benjamin
    Drugman, Thomas
    Dutoit, Thierry
    NEUROCOMPUTING, 2014, 132 : 142 - 147
  • [49] Soft context clustering for F0 modeling in HMM-based speech synthesis
    Khorram, Soheil
    Sameti, Hossein
    King, Simon
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2015,
  • [50] Nearest Neighbor Approach in Speaker Adaptation for HMM-based Speech Synthesis
    Mohammadi, Amir
    Demiroglu, Cenk
    2013 21ST SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2013,