34 entries in total
[1]
[Anonymous], 2016, ARXIV PREPR ARXIV160, DOI 10.48550/ARXIV.1609.03499
[2]
Arik SÖ, 2017, ADV NEUR IN, V30
[3]
Casanova E, 2022, PR MACH LEARN RES
[4]
Chen M., 2020, PROC INT C LEARN REP
[5]
MultiSpeech: Multi-Speaker Text to Speech with Transformer. INTERSPEECH 2020, 2020: 4024-4028
[6]
Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech. 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2021), 2021: 8588-8592
[8]
Attentron: Few-Shot Text-to-Speech Utilizing Attention-Based Variable-Length Embedding. INTERSPEECH 2020, 2020: 2007-2011
[9]
Xception: Deep Learning with Depthwise Separable Convolutions. 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), 2017: 1800-1807
[10]
ECAPA-TDNN: Emphasized Channel Attention, Propagation and Aggregation in TDNN Based Speaker Verification. INTERSPEECH 2020, 2020: 3830-3834