66 entries in total
[1]
Anderson P, 2018, PROC CVPR IEEE, P6077, DOI [10.1002/ett.70087, 10.1109/CVPR.2018.00636]
[2]
Neural Module Networks [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 39-48
[3]
VQA: Visual Question Answering [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015: 2425-2433
[4]
Ben-Younes H, 2019, AAAI CONF ARTIF INTE, P8102
[5]
MUTAN: Multimodal Tucker Fusion for Visual Question Answering [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017: 2631-2639
[6]
MUREL: Multimodal Relational Reasoning for Visual Question Answering [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 1989-1998
[7]
Counterfactual Samples Synthesizing for Robust Visual Question Answering [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020: 10797-10806
[8]
Chen Xinlei, 2015, CoRR abs/1504.00325
[9]
Chung Junyoung, 2014, CoRR
[10]
Improved Fusion of Visual and Language Representations by Dense Symmetric Co-Attention for Visual Question Answering [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 6087-6096