66 entries in total
- [31] Li Z., 2021, Advances in Neural Information Processing Systems, P13165
- [32] TSM: Temporal Shift Module for Efficient Video Understanding [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019: 7082-7092
- [33] TAM: Temporal Adaptive Module for Video Recognition [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 13688-13698
- [34] Liu ZY, 2020, AAAI CONF ARTIF INTE, V34, P11669
- [35] Loshchilov Ilya, 2018, P 7 INT C LEARN REPR
- [36] Mnih V, 2013, ARXIV
- [37] Neimark D, 2021, IEEE INT CONF COMP V, P3156, DOI [arXiv:2102.00719, 10.1109/ICCVW54120.2021.00355]
- [38] Patrick Mandela, 2021, ADV NEUR IN, V34
- [39] Spatiotemporal Contrastive Video Representation Learning [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 6960-6970
- [40] Radford A., 2018, Improving language understanding by generative pre-training