38 entries in total
- [1] Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 4971-4980
- [2] Neural Module Networks [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 39-48
- [3] Banerjee S., 2005, P ACL WORKSH INTR EX, P65, DOI 10.3115/1626355.1626389
- [4] Chen Lingjiao, 2023, How is ChatGPT's behavior changing over time?
- [5] Chen M., 2021, arXiv, DOI 10.48550/ARXIV.2107.03374
- [6] Chiang W.L., 2023, Vicuna: An open-source chatbot impressing GPT-4 with 90%* ChatGPT quality
- [7] Dai Wenliang, 2023, InstructBLIP: Towards general-purpose vision-language models with instruction tuning
- [8] Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 6325-6334
- [9] Visual Programming: Compositional visual reasoning without training [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023: 14953-14962
- [10] Hu Edward J, 2022, INT C LEARNING REPR