22 entries in total
[1]
Direct Acoustics-to-Word Models for English Conversational Speech Recognition [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017: 959-963
[2]
Bengio S, 2015, ADV NEUR IN, V28
[3]
Chan W, 2016, INT CONF ACOUST SPEE, P4960, DOI 10.1109/ICASSP.2016.7472621
[4]
Chan William, 2021, CoRR, abs/2104.02133
[5]
Chorowski J, 2015, ADV NEUR IN, V28
[6]
W2V-BERT: COMBINING CONTRASTIVE LEARNING AND MASKED LANGUAGE MODELING FOR SELF-SUPERVISED SPEECH PRE-TRAINING [J]. 2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021: 244-250
[7]
Goyal P, 2018, Arxiv, DOI arXiv:1706.02677
[8]
Conformer: Convolution-augmented Transformer for Speech Recognition [J]. INTERSPEECH 2020, 2020: 5036-5040
[9]
Inan H., 2017, ICLR, P1
[10]
Kingma D. P., 2015, INT C LEARN REPR ICL, P1