41 entries in total
[1]
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 347-356
[2]
Chen K., 2019, CoRR abs/1906.07155
[4]
Chu XX, 2021, ADV NEUR IN
[5]
Deng J, 2009, PROC CVPR IEEE, P248, DOI 10.1109/CVPRW.2009.5206848
[6]
Dong B., 2021, ARXIV PREPRINT ARXIV
[7]
Dosovitskiy A, 2021, ICLR
[8]
Glorot X., 2010, P 13 INT C ART INT S, P249
[9]
LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 12239-12249
[10]
Han K., 2021, P NIPS 21 P 35 INT C