46 entries in total
[1]
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 6077-6086
[2]
[Anonymous], 2013, NeurIPS, DOI 10.48550/ARXIV.1310.4546
[4]
Global Relation-Aware Attention Network for Image-Text Retrieval [J]. PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL (ICMR '21), 2021: 19-28
[5]
CHANG SK, 1981, COMPUTER, V14, P13, DOI [10.1109/C-M.1981.220243, 10.1109/C-M.1981.220241]
[6]
IMRAM: Iterative Matching with Recurrent Attention Memory for Cross-Modal Image-Text Retrieval [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020: 12652-12660
[7]
Deng J, 2009, PROC CVPR IEEE, P248, DOI 10.1109/CVPRW.2009.5206848
[9]
Linking Image and Text with 2-Way Nets [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 1855-1865
[10]
Fartash F., 2018, BRIT MACH VIS C, P935