50 items in total
[31]
Hierarchical Cross-Modal Graph Consistency Learning for Video-Text Retrieval
[J].
SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL,
2021,
:1114-1124
[32]
CMMT: Cross-Modal Meta-Transformer for Video-Text Retrieval
[J].
PROCEEDINGS OF THE 2023 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2023,
2023,
:76-84
[34]
CLIP Based Multi-Event Representation Generation for Video-Text Retrieval
[J].
Jisuanji Yanjiu yu Fazhan/Computer Research and Development,
2023, 60 (09)
:2169-2179
[36]
Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval
[J].
ICMR '18: PROCEEDINGS OF THE 2018 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL,
2018,
:19-27
[37]
Fine-Grained Cross-Modal Contrast Learning for Video-Text Retrieval
[J].
ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT V, ICIC 2024,
2024, 14866
:298-310
[38]
Context-Aware Hierarchical Transformer for Fine-Grained Video-Text Retrieval
[J].
2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP,
2022,
:386-390
[39]
A Multi-interaction Model with Cross-Branch Feature Fusion for Video-Text Retrieval
[J].
NEURAL INFORMATION PROCESSING, ICONIP 2021, PT VI,
2022, 1517
:476-484
[40]
JM-CLIP: A Joint Modal Similarity Contrastive Learning Model for Video-Text Retrieval
[J].
2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024,
2024,
:3010-3014