79 entries in total
[11] X3D: Expanding Architectures for Efficient Video Recognition[C]. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020: 200-210.
[12] Convolutional Two-Stream Network Fusion for Video Action Recognition[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016: 1933-1941.
[13] Finn C. Proceedings of Machine Learning Research, 2017, 70.
[14] Depth Guided Adaptive Meta-Fusion Network for Few-shot Video Recognition[C]. MM '20: Proceedings of the 28th ACM International Conference on Multimedia, 2020: 1142-1151.
[15] Embodied One-Shot Video Recognition: Learning from Actions of a Virtual Embodied Agent[C]. Proceedings of the 27th ACM International Conference on Multimedia (MM '19), 2019: 411-419.
[16] The "something something" video database for learning and evaluating visual common sense[C]. 2017 IEEE International Conference on Computer Vision (ICCV), 2017: 5843-5851.
[17] Temporal Alignment Networks for Long-term Video[C]. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), 2022: 2896-2906.
[18] Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018: 6546-6555.
[19] Deep Residual Learning for Image Recognition[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016: 770-778.
[20] Hu J. Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018: 7132. DOI: 10.1109/TPAMI.2019.2913372, 10.1109/CVPR.2018.00745.