Discrete/Continuous Modelling of Speaking Style in HMM-based Speech Synthesis: Design and Evaluation

Cited by: 0

Authors:

Obin, Nicolas [1,2]

Lanchantin, Pierre [1]

Lacheret, Anne [2]

Rodet, Xavier [1]

Affiliations:

[1] IRCAM, Paris, France

[2] Univ Paris Ouest, Modyco Lab, Nanterre, France

Source:

12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5 | 2011

Keywords:

speaking style; speech synthesis; speech prosody; average modelling;

DOI:

Not available

Chinese Library Classification (CLC):

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper assesses the ability of an HMM-based speech synthesis system to model the speech characteristics of various speaking styles. A discrete/continuous HMM is presented to model the symbolic and acoustic speech characteristics of a speaking style. The proposed model is used to model the average characteristics of a speaking style that is shared among various speakers, depending on specific situations of speech communication. The evaluation consists of an identification experiment on 4 speaking styles based on delexicalized speech, and is compared to a similar experiment on natural speech. The comparison is discussed and reveals that the discrete/continuous HMM consistently models the speech characteristics of a speaking style.


Pages: 2796 / +

Page count: 2

References: 16 in total

[1]

[Anonymous], 1999, P EUROSPEECH

[2]

Bell P., 2006, SPEECH PROSODY

[3]

Cohen J., 1960, A COEFFICIENT OF AGREEMENT FOR NOMINAL SCALES, EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT, V20 (01), P37-46

[4]

Krstulovic S., 2007, INTERSPEECH

[5]

Lacheret A., 2010, LING ANN WORKSH UPPS

[6]

Obin N., 2010, SPEECH PROSODY

[7]

Obin N, 2010, 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, P3070

[8]

Obin N, 2010, 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, P1133

[9]

Schmid H., 2004, COLING 04, P659

[10]

Silverman Kim EA, 1992, Proceedings of the 1992 International Conference on Spoken Language Processing, V2, P867
