50 items in total
- [31] CMMT: Cross-Modal Meta-Transformer for Video-Text Retrieval. Proceedings of the 2023 ACM International Conference on Multimedia Retrieval, ICMR 2023, 2023: 76-84
- [32] CLIP Based Multi-Event Representation Generation for Video-Text Retrieval. Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2023, 60(09): 2169-2179
- [33] Learning a Video-Text Joint Embedding using Korean Tagged Movie Clips. 11th International Conference on ICT Convergence: Data, Network, and AI in the Age of Untact (ICTC 2020), 2020: 1158-1160
- [34] Fine-Grained Cross-Modal Contrast Learning for Video-Text Retrieval. Advanced Intelligent Computing Technology and Applications, Pt V, ICIC 2024, 2024, 14866: 298-310
- [35] Context-Aware Hierarchical Transformer for Fine-Grained Video-Text Retrieval. 2022 IEEE International Conference on Image Processing, ICIP, 2022: 386-390
- [36] VideoCLIP: A Cross-Attention Model for Fast Video-Text Retrieval Task with Image CLIP. Proceedings of the 2022 International Conference on Multimedia Retrieval, ICMR 2022, 2022: 29-33
- [37] Multi-Feature Graph Attention Network for Cross-Modal Video-Text Retrieval. Proceedings of the 2021 International Conference on Multimedia Retrieval (ICMR '21), 2021: 135-143
- [38] MILES: Visual BERT Pre-training with Injected Language Semantics for Video-Text Retrieval. Computer Vision - ECCV 2022, Pt XXXV, 2022, 13695: 691-708
- [40] Integrated Modalities and Multi-Level Granularity: Towards a Unified Video-Text Retrieval Framework. 2021 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), 2021