44 entries in total
[1]
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 6077-6086
[2]
[Anonymous], 2018, 41 INT ACM SIGIR C R, DOI 10.1145/3209978.3210003
[3]
VQA: Visual Question Answering [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015: 2425-2433
[4]
Devlin J., 2019, CORR, V1, P4171
[5]
Faghri Fartash, 2018, BRIT MACH VIS C
[6]
Frome A., 2013, Advances in neural information processing systems, V26, P2121
[7]
Dynamic Fusion with Intra- and Inter-modality Attention Flow for Visual Question Answering [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 6632-6641
[8]
Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 7181-7189
[9]
Deep Residual Learning for Image Recognition [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 770-778
[10]
Hu Y., 2020, MATH PROBL ENG, V2020, P1