36 entries in total
[1]
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 1708-1718
[2]
Revisiting the "Video" in Video-Language Understanding [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022: 2907-2917
[3]
Heilbron FC, 2015, PROC CVPR IEEE, P961, DOI 10.1109/CVPR.2015.7298698
[4]
Chang HS, 1999, IEEE T CIRC SYST VID, V9, P1269, DOI 10.1109/76.809161
[5]
Chang SF, 2003, 12TH INTERNATIONAL CONFERENCE ON IMAGE ANALYSIS AND PROCESSING, PROCEEDINGS, P494
[6]
Chen Dave Zhenyu, 2022, arXiv
[7]
Chen Y., 2023, arXiv
[8]
Cheng Xing, 2021, CoRR
[9]
Devlin J, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P4171
[10]
Divakaran Ajay, 2002, P 2002 INT C IM PROC, V1, pI, DOI 10.1109/ICIP.2002.1038180