42 entries in total
- [1] Bao H., 2021, arXiv
- [2] Bertasius G, 2021, PR MACH LEARN RES, V139
- [3] Improving Spatiotemporal Self-supervision by Deep Reinforcement Learning [J]. COMPUTER VISION - ECCV 2018, PT 15, 2018, 11219: 797-814
- [4] Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 4724-4733
- [5] Chen PH, 2021, AAAI CONF ARTIF INTE, V35, P1045
- [6] Devlin J, 2019, arXiv, DOI arXiv:1810.04805
- [7] Dosovitskiy A, 2021, arXiv, DOI arXiv:2010.11929
- [8] Learning Spatiotemporal Features with 3D Convolutional Networks [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015: 4489-4497
- [9] Multiscale Vision Transformers [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 6804-6815
- [10] SlowFast Networks for Video Recognition [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019: 6201-6210