68 entries in total
- [61] Touvron H., 2023, arXiv, DOI [arXiv:2302.13971, 10.48550/arXiv.2302.13971]
- [62] Tran K. Q., 2021, P 35 PACIFIC ASIA C, P683
- [63] Tran K. V., 2023, arXiv, DOI arXiv:2310.18046
- [64] Wang W., 2024, Adv. Neural Inf. Process. Syst., V36
- [67] VinVL: Revisiting Visual Representations in Vision-Language Models. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2021), 2021: 5575-5584
- [68] Visual7W: Grounded Question Answering in Images. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016: 4995-5004