UNSUPERVISED LEARNING FOR MULTI-STYLE SPEECH SYNTHESIS WITH LIMITED DATA

Authors
Liang, Shuang [1]
Miao, Chenfeng [1]
Chen, Minchuan [1]
Ma, Jun [1]
Wang, Shaojun [1]
Xiao, Jing [1]
Affiliations
[1] Ping An Technology, Shenzhen, People's Republic of China
Source
2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) | 2021
Keywords
speech synthesis; unsupervised; instance discriminator; information bottleneck;
DOI
10.1109/ICASSP39728.2021.9414220
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
Existing multi-style speech synthesis methods require either style labels or large amounts of unlabeled training data, making data acquisition difficult. In this paper, we present an unsupervised multi-style speech synthesis method that can be trained with limited data. We leverage an instance discriminator to guide a style encoder to learn meaningful style representations from a multi-style dataset. Furthermore, we employ an information bottleneck to filter out style-irrelevant information in the representations, which can improve speech quality and style similarity. Our method is able to produce desirable speech using a fairly small dataset, where the baseline GST-Tacotron fails. ABX tests show that our model significantly outperforms GST-Tacotron on both the emotional speech synthesis task and the multi-speaker speech synthesis task. In addition, we demonstrate that our method is able to learn meaningful style features with only 50 training samples per style.
Pages: 6583 - 6587
Number of pages: 5
Related papers
50 in total
  • [21] Multi-Style Generative Reading Comprehension
    Nishida, Kyosuke
    Saito, Itsumi
    Nishida, Kosuke
    Shinoda, Kazutoshi
    Otsuka, Atsushi
    Asano, Hisako
    Tomita, Junji
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019 : 2273 - 2284
  • [22] Interactive Artistic Multi-style Transfer
    Wang, Xiaohui
    Lyu, Yiran
    Huang, Junfeng
    Wang, Ziying
    Qin, Jingyan
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2021, 14 (01)
  • [24] Design of a Multi-Style and Multi-Frequency FPGA
    Manoranjan, Jotham Vaddaboina
    Sajjan, Solomon Surya Tej Mano
    Gujari, Vivek B.
    Stevens, Kenneth S.
    2016 IFIP/IEEE INTERNATIONAL CONFERENCE ON VERY LARGE SCALE INTEGRATION (VLSI-SOC), 2016
  • [25] Multi-style learning with denoising autoencoders for acoustic modeling in the internet of things (IoT)
    Lin, Payton
    Lyu, Dau-Cheng
    Chen, Fei
    Wang, Syu-Siang
    Tsao, Yu
    COMPUTER SPEECH AND LANGUAGE, 2017, 46 : 481 - 495
  • [26] Image Style Transfer via Multi-Style Geometry Warping
    Alexandru, Ioana
    Nicula, Constantin
    Prodan, Cristian
    Rotaru, Razvan-Paul
    Tarba, Nicolae
    Boiangiu, Costin-Anton
    APPLIED SCIENCES-BASEL, 2022, 12 (12)
  • [27] FPGA architecture for multi-style asynchronous logic
    Huot, N
    Dubreuil, H
    Fesquet, L
    Renaudin, M
    DESIGN, AUTOMATION AND TEST IN EUROPE CONFERENCE AND EXHIBITION, VOLS 1 AND 2, PROCEEDINGS, 2005 : 32 - 33
  • [28] MSN: Multi-Style Network for Trajectory Prediction
    Wong, Conghao
    Xia, Beihao
    Peng, Qinmu
    Yuan, Wei
    You, Xinge
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (09) : 9751 - 9766
  • [29] INVESTIGATING ON INCORPORATING PRETRAINED AND LEARNABLE SPEAKER REPRESENTATIONS FOR MULTI-SPEAKER MULTI-STYLE TEXT-TO-SPEECH
    Chien, Chung-Ming
    Lin, Jheng-Hao
    Huang, Chien-yu
    Hsu, Po-chun
    Lee, Hung-yi
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021 : 8588 - 8592
  • [30] Domain Generalization for Mammography Detection via Multi-style and Multi-view Contrastive Learning
    Li, Zheren
    Cui, Zhiming
    Wang, Sheng
    Qi, Yuji
    Ouyang, Xi
    Chen, Qitian
    Yang, Yuezhi
    Xue, Zhong
    Shen, Dinggang
    Cheng, Jie-Zhi
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2021, PT VII, 2021, 12907 : 98 - 108