50 entries in total
- [21] CAD - Contextual Multi-modal Alignment for Dynamic AVQA [J]. 2024 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION, WACV 2024, 2024 : 7236 - 7248
- [22] Hierarchical multi-modal video summarization with dynamic sampling [J]. IET IMAGE PROCESSING, 2024, 18 (14) : 4577 - 4588
- [23] Multi-Modal Dynamic Graph Transformer for Visual Grounding [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022 : 15513 - 15522
- [25] Multi-modal Language Models for Lecture Video Retrieval [J]. PROCEEDINGS OF THE 2014 ACM CONFERENCE ON MULTIMEDIA (MM'14), 2014 : 1081 - 1084
- [26] The role of mental models in a multi-modal image search [J]. ASIST 2001: PROCEEDINGS OF THE 64TH ASIST ANNUAL MEETING, VOL 38, 2001, 2001, 38 : 52 - 57
- [28] Visual Hallucinations of Multi-modal Large Language Models [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024 : 9614 - 9631
- [29] Enhancing Image Classification Models with Multi-modal Biomarkers [J]. MEDICAL IMAGING 2011: COMPUTER-AIDED DIAGNOSIS, 2011, 7963
- [30] Emotional Models for Multi-modal Communication of Robot Partners [J]. 2013 IEEE INTERNATIONAL SYMPOSIUM ON INDUSTRIAL ELECTRONICS (ISIE), 2013