60 entries in total
[1] ViViT: A Video Vision Transformer [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 6816-6826
[2] Asim M, 2018, 2018 COL VIS COMP S, P1, DOI 10.1109/CVCS.2018.8496473
[4] Bertasius G, 2021, PR MACH LEARN RES, V139
[5] Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 4724-4733
[8] Learning Spatiotemporal Features with 3D Convolutional Networks [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015: 4489-4497
[9] Omni-Sourced Webly-Supervised Learning for Video Recognition [J]. COMPUTER VISION - ECCV 2020, PT XV, 2020, 12360: 670-688
[10] X3D: Expanding Architectures for Efficient Video Recognition [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020: 200-210