58 entries in total
- [1] Arik SÖ, 2017, ADV NEUR IN, V30
- [2] Arik SÖ, 2018, ADV NEUR IN, V31
- [3] Baevski A, 2020, ADV NEUR IN, V33
- [4] Baevski A, 2020, arXiv, arXiv:1910.05453
- [5] Behre C., 2017, The relationship between fundamental frequency variation and articulation in healthy speech production
- [6] Casanova E, 2022, PR MACH LEARN RES
- [7] SC-GlowTTS: an Efficient Zero-Shot Multi-Speaker Text-To-Speech Model. INTERSPEECH 2021, 2021: 3645-3649
- [9] Chen M., 2021, P INT C LEARN REPR
- [10] Cross-lingual, Multi-speaker Text-To-Speech Synthesis Using Neural Speaker Embedding. INTERSPEECH 2019, 2019: 2105-2109