85 entries in total
- [1] Afouras Triantafyllos, 2020, COMPUTER VISION - ECCV
- [3] Look, Listen and Learn [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017: 609-617
- [4] Aytar Y, 2016, ADV NEUR IN, V29
- [5] End-to-End Referring Video Object Segmentation with Multimodal Transformers [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022: 4975-4985
- [6] One-Shot Video Object Segmentation [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 5320-5329
- [7] Chen H., 2021, BRIT MACH VIS C BMVC, P1
- [8] Localizing Visual Sounds the Hard Way [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 16862-16871
- [9] Chen HL, 2020, INT CONF ACOUST SPEE, P721, DOI 10.1109/ICASSP40776.2020.9053174