57 entries in total
[41]
A Comparison of Sequence-to-Sequence Models for Speech Recognition [J]. 18th Annual Conference of the International Speech Communication Association (INTERSPEECH 2017), Vols 1-6: Situated Interaction, 2017: 939-943
[42]
Lower Frame Rate Neural Network Acoustic Models [J]. 17th Annual Conference of the International Speech Communication Association (INTERSPEECH 2016), Vols 1-5: Understanding Speech Processing in Humans and Machines, 2016: 22-26
[44]
Sainath TN, 2020, INT CONF ACOUST SPEE, P6059, DOI 10.1109/ICASSP40776.2020.9054188
[45]
Advancing RNN Transducer Technology for Speech Recognition [J]. 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2021), 2021: 5654-5658
[46]
Su H, 2013, INT CONF ACOUST SPEE, P6664, DOI 10.1109/ICASSP.2013.6638951
[47]
Transformer Language Models with LSTM-Based Cross-Utterance Information Representation [J]. 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2021), 2021: 7363-7367
[48]
On the Limit of English Conversational Speech Recognition [J]. INTERSPEECH 2021, 2021: 2062-2066
[49]
Single Headed Attention Based Sequence-to-Sequence Model for State-of-the-Art Results on Switchboard [J]. INTERSPEECH 2020, 2020: 551-555
[50]
Vaswani A, 2017, ADV NEUR IN, V30