32 entries in total
- [2] [Anonymous], 2017, LJ SPEECH DATASET
- [3] One-shot Voice Conversion by Separating Speaker and Content Representations with Instance Normalization. INTERSPEECH 2019, 2019: 664-668
- [5] Conformer: Convolution-augmented Transformer for Speech Recognition. INTERSPEECH 2020, 2020: 5036-5040
- [6] ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context. INTERSPEECH 2020, 2020: 3610-3614
- [7] Hsu W.-N., 2018, P INT C LEARN REPR
- [8] Hu TY, 2020, INT CONF ACOUST SPEE, P3267, DOI 10.1109/ICASSP40776.2020.9054591
- [9] StarGAN-VC2: Rethinking Conditional Methods for StarGAN-Based Voice Conversion. INTERSPEECH 2019, 2019: 679-683
- [10] CopyCat: Many-to-Many Fine-Grained Prosody Transfer for Neural Text-to-Speech. INTERSPEECH 2020, 2020: 4387-4391