38 entries in total
- [11] Learning to Reason: End-to-End Module Networks for Visual Question Answering [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017: 804-813
- [12] Hudson Drew, 2019, ADV NEURAL INFORM PR, V32
- [13] Hudson Drew A, 2019, CONFERENCE COMPUTE
- [14] Hudson Drew A, 2018, INT C LEARNING REPR
- [15] Jain S, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P3543
- [16] Jia Robin, 2017, P 2017 C EMP METH NA, P2021, DOI 10.18653/V1/D17-1215
- [17] CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 1988-1997
- [20] Li Junnan, 2023, ICML