76 entries in total
[1]
Abbas A, 2023, Arxiv, DOI [arXiv:2303.09540, 10.48550/arXiv.2303.09540]
[2]
Achiam J., 2023, GPT 4 TECHNICAL REPO, DOI 10.48550/arXiv.2303.08774
[3]
[Anonymous], 2024, OpenAI.gpt-4-vision-preview
[4]
VQA: Visual Question Answering [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015: 2425-2433
[5]
Bai JZ, 2023, Arxiv, DOI [arXiv:2308.12966, 10.48550/arXiv.2308.12966]
[6]
Beagle: Automated Extraction and Interpretation of Visualizations from the Web [J]. PROCEEDINGS OF THE 2018 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS (CHI 2018), 2018
[8]
Brown TB, 2020, ADV NEUR IN, V33
[9]
Card S.K., 1999, Readings in Information Visualization: Using Vision to Think
[10]
Honeybee: Locality-enhanced Projector for Multimodal LLM [J]. 2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024: 13817-13827