Discrete/Continuous Modelling of Speaking Style in HMM-based Speech Synthesis: Design and Evaluation

Cited by: 0
Authors
Obin, Nicolas [1, 2]
Lanchantin, Pierre [1]
Lacheret, Anne [2]
Rodet, Xavier [1]
Affiliations
[1] IRCAM, Paris, France
[2] Univ Paris Ouest, Modyco Lab, Nanterre, France
Source
12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5 | 2011
Keywords
speaking style; speech synthesis; speech prosody; average modelling
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper assesses the ability of an HMM-based speech synthesis system to model the speech characteristics of various speaking styles. A discrete/continuous HMM is presented to model the symbolic and acoustic speech characteristics of a speaking style. The proposed model is used to model the average characteristics of a speaking style that is shared among various speakers, depending on specific situations of speech communication. The evaluation consists of an identification experiment on 4 speaking styles based on delexicalized speech, which is compared to a similar experiment on natural speech. The comparison is discussed and reveals that the discrete/continuous HMM consistently models the speech characteristics of a speaking style.
Pages: 2796 / +
Page count: 2