50 items in total
- [35] E3M: Zero-Shot Spatio-Temporal Video Grounding with Expectation-Maximization Multimodal Modulation. COMPUTER VISION - ECCV 2024, PT LXXXIII, 2025, 15141: 227-243
- [39] Grounding Visual Concepts for Zero-Shot Event Detection and Event Captioning. KDD '20: PROCEEDINGS OF THE 26TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2020: 297-305