57 entries in total
[1] VQA: Visual Question Answering [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015: 2425-2433
[2] Compositional Learning of Image-Text Query for Image Retrieval [J]. 2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2021), 2021: 1139-1148
[3] Deep Attention Neural Tensor Network for Visual Question Answering [J]. COMPUTER VISION - ECCV 2018, PT XII, 2018, 11216: 21-37
[4] Berg TL, 2010, LECT NOTES COMPUT SC, V6311, P663, DOI 10.1007/978-3-642-15549-9_48
[5] Carion Nicolas, 2020, Computer Vision - ECCV 2020. 16th European Conference. Proceedings. Lecture Notes in Computer Science (LNCS 12346), P213, DOI 10.1007/978-3-030-58452-8_13
[6] Leveraging Style and Content features for Text Conditioned Image Retrieval [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021: 3973-3977
[7] Chen YB, 2020, Img Proc Comp Vis Re, V12367, P136, DOI 10.1007/978-3-030-58542-6_9
[8] Image Search with Text Feedback by Visiolinguistic Attention Learning [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020: 2998-3008
[9] Cho K., 2014, P C EMP METH NAT LAN, P1724
[10] Delmas G, 2022, Arxiv, DOI arXiv:2203.08101