44 entries in total
[1]
Abnar S, 2020, 58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), P4190
[2]
ViViT: A Video Vision Transformer. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 6816-6826
[3]
Bertasius G, 2021, PR MACH LEARN RES, V139
[4]
Bhandari K, 2020, IEEE IMAGE PROC, P266, DOI 10.1109/ICIP40778.2020.9191256
[5]
High accuracy optical flow estimation based on a theory for warping. COMPUTER VISION - ECCV 2004, PT 4, 2004, 2034: 25-36
[6]
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 4724-4733
[7]
DPT: Deformable Patch-based Transformer for Visual Recognition. PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021: 2899-2907
[8]
Deformable Convolutional Networks. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017: 764-773
[9]
Dosovitskiy Alexey, 2021, P ICLR
[10]
Tangent Images for Mitigating Spherical Distortion. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020: 12423-12431