Speaker Adaptation using Relevance Vector Regression for HMM-based Expressive TTS

Citations: 0

Authors:

Hong, Doo Hwa [1]

Lee, Joun Yeop

Jang, Se Young

Kim, Nam Soo

Affiliations:

[1] Seoul Natl Univ, Dept Elect & Comp Engn, Seoul, South Korea

Source:

16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | 2015

Keywords:

speech synthesis; speaker adaptation; MLLR; relevance vector regression; likelihood linear regression; kernel regression

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Subject classification codes:

070206; 082403

Abstract:

The conventional maximum likelihood linear regression (MLLR)-based adaptation algorithm applied to acoustic hidden Markov models (HMMs) is restricted to linear regression and therefore cannot represent the details of the mapping characteristics. To overcome this problem, we propose a relevance vector regression (RVR)-based model parameter adaptation technique. In this framework, the conventional technique is extended to use many more basis functions. The weights for constructing the transform matrix are obtained by sparse Bayesian learning, in which most of the weights become zero because of the prior defined with precision hyperparameters. Furthermore, with appropriate kernel functions, RVR can combine the advantages of linear and nonlinear regression. In the experiments, an emotional speech database is used for adaptation to evaluate the proposed method against the conventional constrained MLLR. From the experimental results, we conclude that the RVR adaptation method performs better than the conventional method.
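
For orientation, the sparse Bayesian learning mentioned in the abstract can be sketched with the generic textbook form of relevance vector regression. This is only an illustrative outline, not necessarily the formulation used in the paper: the symbols (targets t, design matrix Φ of basis-function outputs, weights w, per-weight precisions α_i, noise precision β, N observations, M basis functions) are assumptions introduced here and do not appear in the record.

```latex
\begin{align}
  % Regression model with Gaussian noise of precision \beta (assumed notation)
  \mathbf{t} &= \boldsymbol{\Phi}\mathbf{w} + \boldsymbol{\epsilon},
  \qquad \boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \beta^{-1}\mathbf{I}) \\
  % Zero-mean prior with one precision hyperparameter per weight
  p(\mathbf{w} \mid \boldsymbol{\alpha}) &= \prod_{i=1}^{M}
  \mathcal{N}\!\left(w_i \mid 0, \alpha_i^{-1}\right) \\
  % Gaussian posterior over the weights, with A = diag(\alpha_1, ..., \alpha_M)
  \boldsymbol{\Sigma} &= \left(\beta\,\boldsymbol{\Phi}^{\top}\boldsymbol{\Phi} + \mathbf{A}\right)^{-1},
  \qquad \boldsymbol{\mu} = \beta\,\boldsymbol{\Sigma}\,\boldsymbol{\Phi}^{\top}\mathbf{t} \\
  % Iterative re-estimation of the hyperparameters (evidence maximization)
  \gamma_i &= 1 - \alpha_i \Sigma_{ii},
  \qquad \alpha_i^{\text{new}} = \frac{\gamma_i}{\mu_i^{2}},
  \qquad \left(\beta^{\text{new}}\right)^{-1} =
  \frac{\lVert \mathbf{t} - \boldsymbol{\Phi}\boldsymbol{\mu} \rVert^{2}}{N - \sum_{i}\gamma_i}
\end{align}
```

In this generic form, many of the α_i grow without bound during re-estimation, which forces the corresponding posterior weights toward zero; this is the sense in which most of the weights become zero.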

Pages: 1216-1220

Number of pages: 5
