78 entries in total
[1]
Abacha A. B., 2019, P CLEF WORK NOTES
[2]
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 6077-6086
[3]
[Anonymous], 2017, P MEDIAEVAL
[4]
[Anonymous], P 3 INT C LEARNING R
[5]
[Anonymous], Simple baseline for visual question answering
[6]
[Anonymous], 2018, ADV NEURAL INFORM PR
[7]
VQA: Visual Question Answering [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015: 2425-2433
[9]
MUTAN: Multimodal Tucker Fusion for Visual Question Answering [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017: 2631-2639
[10]
Benjamin B., 2018, P MEDIAEVAL SOPH ANT