28 entries in total
[1]
[Anonymous], 2017, CSTR VCTK CORPUS ENG
[2]
SPEECHSPLIT2.0: UNSUPERVISED SPEECH DISENTANGLEMENT FOR VOICE CONVERSION WITHOUT TUNING AUTOENCODER BOTTLENECKS. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022: 6332-6336
[3]
Chen T, 2020, PR MACH LEARN RES, V119
[4]
Fang FM, 2018, 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), P5279, DOI 10.1109/ICASSP.2018.8462342
[5]
Gan W., 2022, IQDUBBING PROSODY MO
[6]
Conformer: Convolution-augmented Transformer for Speech Recognition. INTERSPEECH 2020, 2020: 5036-5040
[7]
Huang W.-C., 2020, SEQUENCE TO SEQUENCE
[8]
ON PROSODY MODELING FOR ASR+TTS BASED VOICE CONVERSION. 2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021: 642-649
[9]
Direct speech-to-speech translation with a sequence-to-sequence model. INTERSPEECH 2019, 2019: 1123-1127
[10]
Kameoka H, 2018, IEEE W SP LANG TECH, P266, DOI 10.1109/SLT.2018.8639535