57 entries in total
- [1] Brown TB, 2020, Arxiv, DOI [arXiv:2005.14165, DOI 10.48550/ARXIV.2005.14165]
- [2] Carion Nicolas, 2020, Computer Vision - ECCV 2020. 16th European Conference. Proceedings. Lecture Notes in Computer Science (LNCS 12346), P213, DOI 10.1007/978-3-030-58452-8_13
- [3] Chen M, 2020, PR MACH LEARN RES, V119
- [4] Child R, 2019, Arxiv, DOI [arXiv:1904.10509, DOI 10.48550/ARXIV.1904.10509]
- [5] Chu XX, 2021, ADV NEUR IN
- [6] Deng YY, 2022, Arxiv, DOI arXiv:2105.14576
- [7] Devlin J, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P4171
- [8] CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows, 2022, 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), P12114-12124
- [9] Dosovitskiy A, 2021, IMAGE IS WORTH 16X16
- [10] Esser P, 2021, PROC IEEECVF C COMPU, P9