40 entries in total
[11]
Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 7181-7189
[12]
Hochreiter S, 1997, NEURAL COMPUT, V9, P1735, DOI [10.1162/neco.1997.9.1.1, 10.1007/978-3-642-24797-2]
[13]
Visual Cluster Grounding for Image Captioning [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31: 3920-3934
[14]
Karpathy A, 2014, ADV NEUR IN, V27
[15]
Karpathy A, 2015, PROC CVPR IEEE, P3128, DOI 10.1109/CVPR.2015.7298932
[16]
King DB, 2015, ACS SYM SER, V1214, P1, DOI 10.1021/bk-2015-1214.ch001
[17]
Kiros R, 2014, Arxiv, DOI arXiv:1411.2539
[19]
Stacked Cross Attention for Image-Text Matching [J]. COMPUTER VISION - ECCV 2018, PT IV, 2018, 11208: 212-228