46 entries in total
[2]
Akbari S, 2021, ADV NEUR IN, V34
[3]
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 1708-1718
[5]
Deng J, 2009, PROC CVPR IEEE, P248, DOI 10.1109/CVPRW.2009.5206848
[6]
Devlin J, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P4171
[7]
Fang Han, 2021, arXiv:2106.11097
[8]
Fu Tsu-Jui, 2021, arXiv:2111.12681
[9]
Bridging Video-text Retrieval with Multiple Choice Questions [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022: 16146-16155
[10]
Gu Jiaxi, 2022, arXiv:2202.06767