共 52 条
[1]
Anderson P, 2018, PROC CVPR IEEE, P6077, DOI [10.1109/CVPR.2018.00636, 10.1002/ett.70087]
[2]
Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching
[J].
2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022),
2022,
:2483-2492
[3]
IMRAM: Iterative Matching with Recurrent Attention Memory for Cross-Modal Image-Text Retrieval
[J].
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020),
2020,
:12652-12660
[4]
Probabilistic Embeddings for Cross-Modal Retrieval
[J].
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021,
2021,
:8411-8420
[5]
Chung JY, 2014, Arxiv, DOI [arXiv:1412.3555, 10.48550/arXiv.1412.3555, DOI 10.48550/ARXIV.1412.3555]
[6]
Devlin J, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P4171
[7]
Diao HW, 2021, AAAI CONF ARTIF INTE, V35, P1218
[8]
Faghri Fartash., 2018, BMVC, P12
[9]
Dynamic Fusion with Intra- and Inter-modality Attention Flow for Visual Question Answering
[J].
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019),
2019,
:6632-6641
[10]
Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models
[J].
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR),
2018,
:7181-7189