75 entries in total
[1]
Arik SO, 2017, PR MACH LEARN RES, V70
[2]
Arjovsky M., 2019, P INT C LEARN REPR
[3]
One TTS Alignment to Rule Them All [J]. 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022: 6092-6096
[4]
Baevski A, 2020, ADV NEUR IN, V33
[5]
Bernard M., 2021, Journal of Open Source Software, V6, DOI 10.21105/JOSS.03958
[6]
Black A. W., 2005, P INTERSPEECH, P77, DOI 10.21437/Interspeech.2005-72
[7]
Black AW, 2007, INT CONF ACOUST SPEE, P1229
[8]
XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model [J]. INTERSPEECH 2024, 2024: 4978-4982
[9]
MultiSpeech: Multi-Speaker Text to Speech with Transformer [J]. INTERSPEECH 2020, 2020: 4024-4028
[10]
SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech [J]. INTERSPEECH 2022, 2022: 1-5