An Analysis of Language Mismatch in HMM State Mapping-Based Cross-Lingual Speaker Adaptation

被引:0
|
作者
Liang, Hui [1 ]
Dines, John [1 ]
机构
[1] Idiap Res Inst, Martigny, Switzerland
来源
11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2 | 2010年
关键词
HMM-based TTS; cross-lingual speaker adaptation; HMM state mapping; language mismatch;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
This paper provides an in-depth analysis of the impacts of language mismatch on the performance of cross-lingual speaker adaptation. Our work confirms the influence of language mismatch between average voice distributions for synthesis and for transform estimation and the necessity of eliminating this mismatch in order to effectively utilize multiple transforms for cross-lingual speaker adaptation. Specifically, we show that language mismatch introduces unwanted language-specific information when estimating multiple transforms, thus making these transforms detrimental to adaptation performance. Our analysis demonstrates speaker characteristics should be separated from language characteristics in order to improve cross-lingual adaptation performance.
引用
收藏
页码:622 / 625
页数:4
相关论文
共 13 条
  • [1] Phonological Knowledge Guided HMM State Mapping for Cross-Lingual Speaker Adaptation
    Liang, Hui
    Dines, John
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 1836 - +
  • [2] State mapping based method for cross-lingual speaker adaptation in HMM-based speech synthesis
    Wu, Yi-Jian
    Nankaku, Yoshihiko
    Tokuda, Keiichi
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 516 - 519
  • [3] Cross-lingual Speaker Adaptation for HMM-based Speech Synthesis based on Perceptual Characteristics and Speaker Interpolation
    Oliveira, Viviane de Franca
    Shiota, Sayaka
    Nankaku, Yoshihiko
    Tokuda, Keiichi
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 982 - 985
  • [4] Analysis of unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis using KLD-based transform mapping
    Oura, Keiichiro
    Yamagishi, Junichi
    Wester, Mirjam
    King, Simon
    Tokuda, Keiichi
    SPEECH COMMUNICATION, 2012, 54 (06) : 703 - 714
  • [5] A COMPARISON OF SUPERVISED AND UNSUPERVISED CROSS-LINGUAL SPEAKER ADAPTATION APPROACHES FOR HMM-BASED SPEECH SYNTHESIS
    Liang, Hui
    Dines, John
    Saheer, Lakshmi
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4598 - 4601
  • [6] Cross-lingual speaker adaptation for HMM-based speech synthesis considering differences between language-dependent average voices
    Peng, Xianglin
    Oura, Keiichiro
    Nankaku, Yoshihiko
    Tokuda, Keiichi
    2010 IEEE 10TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS (ICSP2010), VOLS I-III, 2010, : 605 - 608
  • [7] Using Eigenvoices and Nearest-Neighbors in HMM-Based Cross-Lingual Speaker Adaptation With Limited Data
    Sarfjoo, Seyyed Saeed
    Demiroglu, Cenk
    King, Simon
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (04) : 839 - 851
  • [8] A FRAME MAPPING BASED HMM APPROACH TO CROSS-LINGUAL VOICE TRANSFORMATION
    Qian, Yao
    Xu, Ji
    Soong, Frank K.
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 5120 - 5123
  • [9] Personalising speech-to-speech translation: Unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis
    Dines, John
    Liang, Hui
    Saheer, Lakshmi
    Gibson, Matthew
    Byrne, William
    Oura, Keiichiro
    Tokuda, Keiichi
    Yamagishi, Junichi
    King, Simon
    Wester, Mirjam
    Hirsimaki, Teemu
    Karhila, Reima
    Kurimo, Mikko
    COMPUTER SPEECH AND LANGUAGE, 2013, 27 (02) : 420 - 437
  • [10] STATE MAPPING FOR CROSS-LANGUAGE SPEAKER ADAPTATION IN TTS
    Chen, Yi-Ning
    Jiao, Yang
    Qian, Yao
    Soong, Frank K.
    2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 4273 - 4276