50 entries in total
[41] Testing the Limits of Representation Mixing for Pronunciation Correction in End-to-End Speech Synthesis [J]. INTERSPEECH 2020, 2020: 4019-4023
[42] ATTENTION-AUGMENTED END-TO-END MULTI-TASK LEARNING FOR EMOTION PREDICTION FROM SPEECH [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019: 6705-6709
[43] Insights on Neural Representations for End-to-End Speech Recognition [J]. INTERSPEECH 2021, 2021: 4079-4083
[45] MINTZAI: End-to-end Deep Learning for Speech Translation [J]. PROCESAMIENTO DEL LENGUAJE NATURAL, 2020, (65): 97-100
[46] Towards End-to-End Speech-to-Text Summarization [J]. TEXT, SPEECH, AND DIALOGUE, TSD 2023, 2023, 14102: 304-316
[47] A COMPARATIVE STUDY ON END-TO-END SPEECH TO TEXT TRANSLATION [J]. 2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019: 792-799
[48] IMPROVING END-TO-END SPEECH SYNTHESIS WITH LOCAL RECURRENT NEURAL NETWORK ENHANCED TRANSFORMER [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020: 6734-6738
[50] Combination of end-to-end and hybrid models for speech recognition [J]. INTERSPEECH 2020, 2020: 1783-1787