50 items in total
- [41] EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
- [43] COPA: Efficient Vision-Language Pre-training through Collaborative Object- and Patch-Text Alignment. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023: 4480-4491
- [44] IDEA: Increasing Text Diversity via Online Multi-Label Recognition for Vision-Language Pre-training. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022: 4573-4583
- [45] Automated Bridge Inspection Image Interpretation Based on Vision-Language Pre-Training. COMPUTING IN CIVIL ENGINEERING 2023-DATA, SENSING, AND ANALYTICS, 2024: 1-8
- [46] MAP: Multimodal Uncertainty-Aware Vision-Language Pre-training Model. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023: 23262-23271
- [48] Leveraging per Image-Token Consistency for Vision-Language Pre-training. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023: 19155-19164
- [49] MAKE: Vision-Language Pre-training based Product Retrieval in Taobao Search. COMPANION OF THE WORLD WIDE WEB CONFERENCE, WWW 2023, 2023: 356-360
- [50] Enhancing Vision-Language Pre-Training with Jointly Learned Questioner and Dense Captioner. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023: 5120-5131