57 entries in total
[1]
Alayrac Jean-Baptiste, 2022, P NEURIPS NEW ORL
[2]
[Anonymous], 2022, P IEEE CVF C COMP VI, DOI 10.1109/ICPSASIA55496.2022.9949880
[3]
[Anonymous], 2022, P IEEE CVF C COMP VI, DOI 10.1109/SPIES55999.2022.10082039
[4]
ViViT: A Video Vision Transformer [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 6816-6826
[5]
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 1708-1718
[6]
STORM-GAN: Spatio-Temporal Meta-GAN for Cross-City Estimation of Human Mobility Responses to COVID- [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2022: 1-10
[7]
Battaglia P. W., 2018, ARXIV
[8]
Beltagy I., 2020, LONGFORMER LONG DOCU, DOI 10.48550/ARXIV.2004.05150
[9]
Bertasius G, 2021, PR MACH LEARN RES, V139
[10]
Revisiting the "Video" in Video-Language Understanding [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022: 2907-2917