50 records in total
- [21] Self-supervised Neural Audio-Visual Sound Source Localization via Probabilistic Spatial Modeling. 2020 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2020, pp. 4848-4854
- [22] Noise-Tolerant Self-Supervised Learning for Audio-Visual Voice Activity Detection. INTERSPEECH 2021, 2021, pp. 326-330
- [23] Self-Supervised Audio-Visual Representation Learning with Relaxed Cross-Modal Synchronicity. THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 8, 2023, pp. 9723-9732
- [24] Induction Network: Audio-Visual Modality Gap-Bridging for Self-Supervised Sound Source Localization. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, pp. 4042-4052
- [26] Learning Action Representations for Self-supervised Visual Exploration. 2019 INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2019, pp. 5873-5879
- [29] Multi-Modal Perception Attention Network with Self-Supervised Learning for Audio-Visual Speaker Tracking. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, pp. 1456-1463
- [30] Look, Listen, and Attend: Co-Attention Network for Self-Supervised Audio-Visual Representation Learning. MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, pp. 3884-3892