55 items in total
[1] Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 6077-6086
[2] [Anonymous], 2018, IEEE T NEUR NET LEAR, DOI 10.1109/TNNLS.2018.2817340
[3] [Anonymous], 2016, ICLR
[4] VQA: Visual Question Answering [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015: 2425-2433
[5] MUTAN: Multimodal Tucker Fusion for Visual Question Answering [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017: 2631-2639
[6] Bengio Y., 2009, P 26 ANN INT C MACH, P41, DOI 10.1145/1553374.1553380
[7] Chen K., 2015, ABC-CNN: An attention based convolutional neural network for visual question answering
[8] Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020: 9959-9968
[10] Multiple Interaction Learning with Question-Type Prior Knowledge for Constraining Answer Search Space in Visual Question Answering [J]. COMPUTER VISION - ECCV 2020 WORKSHOPS, PT II, 2020, 12536: 496-510