36 entries in total
- [1] Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 1708-1718
- [2] Revisiting the "Video" in Video-Language Understanding [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022: 2907-2917
- [3] Heilbron FC, 2015, PROC CVPR IEEE, P961, DOI 10.1109/CVPR.2015.7298698
- [4] Chang HS, 1999, IEEE T CIRC SYST VID, V9, P1269, DOI 10.1109/76.809161
- [5] Chang SF, 2003, 12TH INTERNATIONAL CONFERENCE ON IMAGE ANALYSIS AND PROCESSING, PROCEEDINGS, P494
- [6] Chen Dave Zhenyu, 2022, arXiv
- [7] Chen Y., 2023, ARXIV
- [8] Devlin J, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P4171
- [9] Divakaran Ajay, 2002, P 2002 INT C IM PROC, V1, pI, DOI 10.1109/ICIP.2002.1038180
- [10] Dosovitskiy A., 2020, ARXIV, V2010, P11929, DOI 10.48550/arXiv.2010.11929