31 entries in total
[1] IMRAM: Iterative Matching with Recurrent Attention Memory for Cross-Modal Image-Text Retrieval [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020: 12652-12660
[2] VirTex: Learning Visual Representations from Textual Annotations [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 11157-11168
[3] Diao HW, 2021, AAAI CONF ARTIF INTE, V35, P1218
[4] Faghri Fartash, 2017, ARXIV
[5] Structured Multi-modal Feature Embedding and Alignment for Image-Sentence Retrieval [J]. PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021: 5185-5193
[6] Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 7181-7189
[7] Deep Residual Learning for Image Recognition [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 770-778
[8] Hua Y, 2019, IEEE INT CONF ELECTR, P252, DOI 10.1109/ICEIEC.2019.8784597
[10] Learning Semantic Concepts and Order for Image and Sentence Matching [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 6163-6171