Expressive control of singing voice synthesis using musical contexts and a parametric F0 model

Cited by: 4

Authors:

Ardaillon, Luc [1]

Chabot-Canet, Celine [1]

Roebel, Axel [1]

Affiliations:

[1] Sorbonne Univ, CNRS, IRCAM, UMR, STMS, Paris, France

Source:

17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES | 2016

Keywords:

singing voice synthesis; singing style; F0; model;

DOI:

10.21437/Interspeech.2016-1317

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Expressive singing voice synthesis requires appropriate control of both prosodic and timbral aspects. While it is desirable to have intuitive control over the expressive parameters, synthesis systems should be able to produce convincing results directly from a score. As countless interpretations of the same score are possible, the system should also target a particular singing style, which implies mimicking the various strategies used by different singers. Among the control parameters involved, the pitch (F0) should be modeled as a priority. In previous work, a parametric F0 model with intuitive controls has been proposed, but no automatic way to choose the model parameters was given. In the present work, we propose a new approach for modeling singing style, based on parametric template selection. In this approach, the F0 parameters and phoneme durations are extracted from annotated recordings, along with a rich description of contextual information, and stored to form a database of parametric templates. This database is then used to build a model of the singing style using decision trees. At the synthesis stage, appropriate parameters are then selected according to the target contexts. The results produced by this approach have been evaluated by means of a listening test.
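
The template-selection idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the context features (`interval`, `note_length`), the parameter names (`overshoot_cents`, `vibrato_rate_hz`), the fixed question order, and the helpers `build_tree` and `select` are all hypothetical, chosen only to show how a database of parametric templates might be organized into a tree and queried by target context.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Template:
    context: dict    # hypothetical musical context of the annotated note
    f0_params: dict  # hypothetical F0 model parameters extracted for it


@dataclass
class Node:
    templates: list
    feature: Optional[str] = None          # context feature tested at this node
    children: dict = field(default_factory=dict)


def build_tree(templates, features):
    """Split templates on each context feature in turn (assumed fixed order)."""
    node = Node(templates=list(templates))
    if not features or len(templates) <= 1:
        return node  # leaf: nothing left to split on
    feature, rest = features[0], features[1:]
    node.feature = feature
    groups = {}
    for t in templates:
        groups.setdefault(t.context.get(feature), []).append(t)
    node.children = {value: build_tree(group, rest)
                     for value, group in groups.items()}
    return node


def select(node, target_context):
    """Descend the tree; stop early if the target context was never observed."""
    while node.feature is not None:
        child = node.children.get(target_context.get(node.feature))
        if child is None:
            break  # back off to the templates pooled at the current node
        node = child
    return node.templates


# Toy database with made-up values, for illustration only.
DATABASE = [
    Template({"interval": "up", "note_length": "long"},
             {"overshoot_cents": 40, "vibrato_rate_hz": 5.5}),
    Template({"interval": "up", "note_length": "short"},
             {"overshoot_cents": 20, "vibrato_rate_hz": 0.0}),
    Template({"interval": "down", "note_length": "long"},
             {"overshoot_cents": 10, "vibrato_rate_hz": 6.0}),
]
TREE = build_tree(DATABASE, ["interval", "note_length"])
```

Calling `select(TREE, {"interval": "up", "note_length": "long"})` returns the candidate templates stored for that context. An unseen context falls back to the templates pooled at the last matching node, which is one simple way to handle sparse data in such a sketch.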

Pages: 1250-1254

Page count: 5

9 items in total

[1] Development of an F0 control model based on F0 dynamic characteristics for singing-voice synthesis
Saitou, T
Unoki, M
Akagi, M
SPEECH COMMUNICATION, 2005, 46 (3-4) : 405 - 417
[2] A multi-layer F0 model for singing voice synthesis using a B-spline representation with intuitive controls
Ardaillon, Luc
Degottex, Gilles
Roebel, Axel
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 3375 - +
[3] Parameter Estimation Method of F0 Control Model for Singing Voices
Ohishi, Yasunori
Kameoka, Hirokazu
Kashino, Kunio
Takeda, Kazuya
INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 139 - +
[4] Sequential Generation of Singing F0 Contours from Musical Note Sequences Based on WaveNet
Wada, Yusuke
Nishikimi, Ryo
Nakamura, Eita
Itoyama, Katsutoshi
Yoshii, Kazuyoshi
2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2018, : 983 - 989
[5] Superpositional HMM-Based Intonation Synthesis Using a Functional F0 Model
Jinfu Ni
Yoshinori Shiga
Chiori Hori
Journal of Signal Processing Systems, 2016, 82 : 273 - 286
[6] Synthesis of F0 contours using generation process model parameters predicted from unlabeled corpora: application to emotional speech synthesis
Hirose, K
Sato, K
Asano, Y
Minematsu, N
SPEECH COMMUNICATION, 2005, 46 (3-4) : 385 - 404
[7] An RNN-based Quantized F0 Model with Multi-tier Feedback Links for Text-to-Speech Synthesis
Wang, Xin
Takaki, Shinji
Yamagishi, Junichi
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1059 - 1063
[8] Voice quality control using perceptual expressions for statistical parametric speech synthesis based on cluster adaptive training
Ohtani, Yamato
Mori, Koichiro
Morita, Masahiro
17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2258 - 2262
[9] EfficientSing: A Chinese Singing Voice Synthesis System Using Duration-Free Acoustic Model and HiFi-GAN Vocoder
Liu, Zhengchen
Miao, Chenfeng
Zhu, Qingying
Chen, Minchuan
Ma, Jun
Wang, Shaojun
Xiao, Jing
INTERSPEECH 2021, 2021, : 1609 - 1613