Cross-Lingual Speaker Adaptation for Statistical Speech Synthesis Using Limited Data

Cited by: 3
Authors
Sarfjoo, Seyyed Saeed [1]
Demiroglu, Cenk [1]
Affiliations
[1] Ozyegin Univ, Elect & Comp Engn Dept, Istanbul, Turkey
Keywords
statistical speech synthesis; speaker adaptation; nearest-neighbor; cross-lingual speaker adaptation; eigenvoice adaptation
DOI
10.21437/Interspeech.2016-345
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Cross-lingual speaker adaptation with limited adaptation data has many applications, such as speech-to-speech translation systems. Here, we focus on cross-lingual adaptation for statistical speech synthesis (SSS) systems using limited adaptation data. To that end, we propose two techniques that exploit a bilingual Turkish-English speech database that we collected. The first approach is a speaker-specific state-mapping for cross-lingual adaptation, which performed significantly better than the baseline state-mapping algorithm in adapting the excitation parameter in both objective and subjective tests. In the second approach, eigenvoice adaptation is done in the input language, and the result is used to estimate the eigenvoice weights in the output language using weighted linear regression. The second approach performed significantly better than the baseline system in adapting the spectral envelope parameters in both objective and subjective tests.
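The second approach in the abstract maps eigenvoice weights from the input language to the output language with weighted linear regression. The sketch below illustrates that general idea on synthetic data only. The function names (`fit_weighted_map`, `predict_output_weights`), the inverse-distance sample weighting, and the toy dimensions are all assumptions made for illustration; the abstract does not state how the paper chooses its regression weights, so this should not be read as the authors' implementation.

```python
import numpy as np


def fit_weighted_map(x_in, y_out, sample_w):
    """Fit y ≈ [x, 1] @ B by weighted least squares (hypothetical helper).

    x_in:     (N, K) input-language eigenvoice weights of N bilingual speakers
    y_out:    (N, K) output-language eigenvoice weights of the same speakers
    sample_w: (N,)   non-negative weight for each training speaker
    """
    x_aug = np.hstack([x_in, np.ones((x_in.shape[0], 1))])  # add bias column
    sw = np.sqrt(sample_w)[:, None]  # scaling rows by sqrt(w) gives the weighted problem
    B, *_ = np.linalg.lstsq(x_aug * sw, y_out * sw, rcond=None)
    return B


def predict_output_weights(B, x_target):
    """Map one speaker's input-language weights to the output language."""
    return np.append(x_target, 1.0) @ B


# Synthetic, noise-free toy data (assumption: 20 bilingual speakers, 3 eigenvoices).
rng = np.random.default_rng(0)
x_in = rng.normal(size=(20, 3))
true_B = rng.normal(size=(4, 3))
y_out = np.hstack([x_in, np.ones((20, 1))]) @ true_B

# Target speaker's weights, as if estimated from limited input-language data.
x_target = rng.normal(size=3)

# Assumed weighting: training speakers closer to the target count more.
dist = np.linalg.norm(x_in - x_target, axis=1)
sample_w = 1.0 / (1.0 + dist**2)

B = fit_weighted_map(x_in, y_out, sample_w)
y_target = predict_output_weights(B, x_target)
```

Because the toy data are generated from an exact linear map without noise, the fitted `B` recovers `true_B` up to floating-point error; with real eigenvoice weights the fit would only be approximate, and the choice of sample weights would matter.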
Pages: 317-321
Page count: 5
Related papers
50 results in total
  • [1] Cross-lingual multi-speaker speech synthesis with limited bilingual training data
    Cai, Zexin
    Yang, Yaogen
    Li, Ming
    COMPUTER SPEECH AND LANGUAGE, 2023, 77
  • [2] Cross-lingual Speaker Adaptation using Domain Adaptation and Speaker Consistency Loss for Text-To-Speech Synthesis
    Xin, Detai
    Saito, Yuki
    Takamichi, Shinnosuke
    Koriyama, Tomoki
    Saruwatari, Hiroshi
    INTERSPEECH 2021, 2021, : 1614 - 1618
  • [3] Cross-lingual speaker adaptation using domain adaptation and speaker consistency loss for text-to-speech synthesis
    Xin, Detai
    Saito, Yuki
    Takamichi, Shinnosuke
    Koriyama, Tomoki
    Saruwatari, Hiroshi
    Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2021, 5 : 3376 - 3380
  • [4] CROSS-LINGUAL SPEAKER ADAPTATION FOR HMM-BASED SPEECH SYNTHESIS
    Wu, Yi-Jian
    King, Simon
    Tokuda, Keiichi
    2008 6TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, 2008, : 9 - 12
  • [5] UNSUPERVISED CROSS-LINGUAL SPEAKER ADAPTATION FOR HMM-BASED SPEECH SYNTHESIS
    Oura, Keiichiro
    Tokuda, Keiichi
    Yamagishi, Junichi
    King, Simon
    Wester, Mirjam
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4594 - 4597
  • [6] Using Eigenvoices and Nearest-Neighbors in HMM-Based Cross-Lingual Speaker Adaptation With Limited Data
    Sarfjoo, Seyyed Saeed
    Demiroglu, Cenk
    King, Simon
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (04) : 839 - 851
  • [7] Cross-lingual Speaker Adaptation for HMM-based Speech Synthesis based on Perceptual Characteristics and Speaker Interpolation
    Oliveira, Viviane de Franca
    Shiota, Sayaka
    Nankaku, Yoshihiko
    Tokuda, Keiichi
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 982 - 985
  • [8] Cross-Lingual Speaker Discrimination Using Natural and Synthetic Speech
    Wester, Mirjam
    Liang, Hui
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 2492 - 2495
  • [9] Personalising speech-to-speech translation: Unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis
    Dines, John
    Liang, Hui
    Saheer, Lakshmi
    Gibson, Matthew
    Byrne, William
    Oura, Keiichiro
    Tokuda, Keiichi
    Yamagishi, Junichi
    King, Simon
    Wester, Mirjam
    Hirsimaki, Teemu
    Karhila, Reima
    Kurimo, Mikko
    COMPUTER SPEECH AND LANGUAGE, 2013, 27 (02): : 420 - 437
  • [10] Cross-lingual, Multi-speaker Text-To-Speech Synthesis Using Neural Speaker Embedding
    Chen, Mengnan
    Chen, Minchuan
    Liang, Shuang
    Ma, Jun
    Chen, Lei
    Wang, Shaojun
    Xiao, Jing
    INTERSPEECH 2019, 2019, : 2105 - 2109