55 entries in total
- [1] ViViT: A Video Vision Transformer [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 6816 - 6826
- [2] Bertasius G, 2021, PR MACH LEARN RES, V139
- [3] Cai H., 2019, INT C LEARNING REPRE
- [4] Carion N., 2020, P EUR C COMP VIS GLA, P213, DOI 10.1007/978-3-030-58452-8_13
- [5] Vision Transformer Slimming: Multi-Dimension Searching in Continuous Optimization Space [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 4921 - 4931
- [6] CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 347 - 356
- [7] Chen M., 2022, ARXIV
- [8] Mobile-Former: Bridging MobileNet and Transformer [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 5260 - 5269
- [9] Chu XX, 2021, ADV NEUR IN
- [10] Deformable Convolutional Networks [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 764 - 773