50 entries in total
- [1] Phoneme Duration Modeling Using Speech Rhythm-Based Speaker Embeddings for Multi-Speaker Speech Synthesis. INTERSPEECH 2021, 2021: 3141-3145
- [2] Phoneme Dependent Speaker Embedding and Model Factorization for Multi-Speaker Speech Synthesis and Adaptation. 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019: 6930-6934
- [3] DNN Based Multi-Speaker Speech Synthesis with Temporal Auxiliary Speaker ID Embedding. 2019 International Conference on Electronics, Information, and Communication (ICEIC), 2019: 61-64
- [4] An Unsupervised Method to Select a Speaker Subset from Large Multi-Speaker Speech Synthesis Datasets. INTERSPEECH 2020, 2020: 1758-1762
- [5] Zero-Shot Multi-Speaker Text-to-Speech with State-of-the-Art Neural Speaker Embeddings. 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2020: 6184-6188
- [7] Normalization Driven Zero-Shot Multi-Speaker Speech Synthesis. INTERSPEECH 2021, 2021: 1354-1358
- [8] Multi-Speaker Multi-Style Speech Synthesis with Timbre and Style Disentanglement. Man-Machine Speech Communication, NCMMSC 2022, 2023, 1765: 132-140
- [9] Cross-Lingual, Multi-Speaker Text-to-Speech Synthesis Using Neural Speaker Embedding. INTERSPEECH 2019, 2019: 2105-2109
- [10] Training Multi-Speaker Neural Text-to-Speech Systems Using Speaker-Imbalanced Speech Corpora. INTERSPEECH 2019, 2019: 1303-1307