Speaker Adaptation using Relevance Vector Regression for HMM-based Expressive TTS

Cited by: 0
Authors
Hong, Doo Hwa [1]
Lee, Joun Yeop
Jang, Se Young
Kim, Nam Soo
Affiliations
[1] Seoul Natl Univ, Dept Elect & Comp Engn, Seoul, South Korea
Source
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | 2015
Keywords
speech synthesis; speaker adaptation; MLLR; relevance vector regression; LIKELIHOOD LINEAR-REGRESSION; KERNEL REGRESSION
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The conventional maximum likelihood linear regression (MLLR)-based adaptation algorithm applied to acoustic hidden Markov models (HMMs) is restricted to linear regression, which is too limited to represent the details of the mapping characteristics. To overcome this problem, we propose a relevance vector regression (RVR)-based model parameter adaptation technique. In this framework, the conventional technique is extended to use many more basis functions. The weights that form the transform matrix are obtained by sparse Bayesian learning, in which most of the weights become zero because of the prior defined with precision hyper-parameters. Furthermore, with appropriate kernel functions, RVR can combine the advantages of linear and nonlinear regression. In the experiments, an emotional speech database is used for adaptation to evaluate the proposed method against conventional constrained MLLR. From the experimental results, we conclude that the RVR adaptation method performs better than the conventional method.
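To make the sparse Bayesian learning step in the abstract concrete, the following is a minimal sketch of generic relevance vector regression on a toy one-dimensional problem. It is not the authors' adaptation algorithm and does not operate on HMM parameters. The function names (`design_matrix`, `fit_rvr`), the choice of a bias term, one linear term, and Gaussian kernels as basis functions, the pruning threshold `alpha_max`, and the synthetic data are all assumptions made for illustration. The re-estimation formulas are the standard ones for this model: each weight has its own precision hyper-parameter, and weights whose precision grows very large are pruned to exactly zero.

```python
import numpy as np


def design_matrix(x, centers, width):
    """Bias column, linear column, and one Gaussian kernel column per center."""
    x = np.asarray(x, dtype=float)
    gauss = np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2.0 * width ** 2))
    return np.hstack([np.ones((x.size, 1)), x[:, None], gauss])


def fit_rvr(phi, t, n_iter=300, alpha_max=1e6):
    """Sparse Bayesian regression by iterative hyper-parameter re-estimation.

    Returns a full-length weight vector in which pruned weights are exactly 0.
    Illustrative sketch: assumes more samples than basis functions.
    """
    n, m = phi.shape
    alpha = np.ones(m)                # one precision hyper-parameter per weight
    beta = 1.0 / np.var(t)            # noise precision, crude initial guess
    active = np.ones(m, dtype=bool)   # basis functions not yet pruned
    mu = np.zeros(m)
    for _ in range(n_iter):
        if not active.any():
            break
        p = phi[:, active]
        a = alpha[active]
        # Gaussian posterior over the active weights.
        sigma = np.linalg.inv(np.diag(a) + beta * (p.T @ p))
        mu_a = beta * (sigma @ (p.T @ t))
        # gamma_i in (0, 1]: how strongly the data determine weight i.
        gamma = np.clip(1.0 - a * np.diag(sigma), 1e-12, 1.0)
        alpha[active] = gamma / (mu_a ** 2 + 1e-30)
        resid = t - p @ mu_a
        beta = max(n - gamma.sum(), 1e-6) / (resid @ resid + 1e-12)
        mu = np.zeros(m)
        mu[active] = mu_a
        # A very large precision pins the weight at zero, so prune it.
        active = active & (alpha < alpha_max)
        mu[~active] = 0.0
    return mu


# Toy data (hypothetical): a linear trend plus one Gaussian bump, with noise.
rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 60)
centers = np.linspace(-3.0, 3.0, 13)
clean = 0.5 * x + 1.5 * np.exp(-((x - 1.0) ** 2) / 2.0)
t = clean + 0.1 * rng.standard_normal(x.size)

phi = design_matrix(x, centers, width=1.0)
w = fit_rvr(phi, t)
rmse = float(np.sqrt(np.mean((phi @ w - t) ** 2)))
print("nonzero weights:", np.count_nonzero(w), "of", w.size)
print("training RMSE:", round(rmse, 3))
```

Mixing a linear column with kernel columns in one design matrix is one simple way to let a single sparse model cover both linear and nonlinear behaviour. How the paper actually builds its basis set and transform matrix is not stated in the abstract.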
Pages: 1216-1220
Page count: 5
Related papers
50 in total
  • [41] Basis-Based Speaker Adaptation Using Partitioned HMM Mean Parameters of Training Speaker Models
    Yongwon Jeong
    Journal of Signal Processing Systems, 2016, 82 : 303 - 310
  • [42] Basis-Based Speaker Adaptation Using Partitioned HMM Mean Parameters of Training Speaker Models
    Jeong, Yongwon
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2016, 82 (03): : 303 - 310
  • [43] Speaker adaptation using discriminative linear regression on time-varying mean parameters in trended HMM
    Chengalvarayan, R
    IEEE SIGNAL PROCESSING LETTERS, 1998, 5 (03) : 63 - 65
  • [44] DEVELOPMENT OF THE SLOVAK HMM-BASED TTS SYSTEM AND EVALUATION OF VOICES IN RESPECT TO THE USED VOCODING TECHNIQUES
    Sulir, Martin
    Juhar, Jozef
    Rusko, Milan
    COMPUTING AND INFORMATICS, 2016, 35 (06) : 1467 - 1490
  • [45] MULTI-SPEAKER MODELING AND SPEAKER ADAPTATION FOR DNN-BASED TTS SYNTHESIS
    Fan, Yuchen
    Qian, Yao
    Soong, Frank K.
    He, Lei
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4475 - 4479
  • [46] Tone correctness improvement in speaker dependent HMM-based Thai speech synthesis
    Chomphan, Suphattharachai
    Kobayashi, Takao
    SPEECH COMMUNICATION, 2008, 50 (05) : 392 - 404
  • [47] SIMPLE METHODS FOR IMPROVING SPEAKER-SIMILARITY OF HMM-BASED SPEECH SYNTHESIS
    Yamagishi, Junichi
    King, Simon
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4610 - 4613
  • [48] A Minimum V/U Error Approach to F0 Generation in HMM-based TTS
    Qian, Yao
    Soong, Frank
    Wang, Miaomiao
    Wu, Zhizheng
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 400 - 403
  • [49] Speaker Adaptation Using i-Vector Based Clustering
    Kim, Minsoo
    Jang, Gil-Jin
    Kim, Ji-Hwan
    Lee, Minho
    KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2020, 14 (07): : 2785 - 2799
  • [50] Thousands of Voices for HMM-Based Speech Synthesis-Analysis and Application of TTS Systems Built on Various ASR Corpora
    Yamagishi, Junichi
    Usabaev, Bela
    King, Simon
    Watts, Oliver
    Dines, John
    Tian, Jilei
    Guan, Yong
    Hu, Rile
    Oura, Keiichiro
    Wu, Yi-Jian
    Tokuda, Keiichi
    Karhila, Reima
    Kurimo, Mikko
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (05): : 984 - 1004