共 22 条
- [1] Baevski A., 2020, wav2vec 2.0: A Framework for SelfSupervised Learning of Speech Representations
- [2] Brown T. B., 2020, P 34 INT C NEUR INF
- [4] Devlin J, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P4171
- [5] Dosovitskiy A., 2020, INT C LEARN REPR
- [6] Gemmeke J. F., 2017, 2017 IEEE INT C ACOU, P776
- [7] Gong Y, 2021, ARXIV211009784
- [9] He Kaiming, 2021, Masked autoencoders are scalable vision learners
- [10] HUBERT: HOW MUCH CAN A BAD TEACHER BENEFIT ASR PRE-TRAINING? [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6533 - 6537