50 items in total
- [1] Masked co-attention model for audio-visual event localization. Applied Intelligence, 2024, 54: 1691-1705
- [3] Audio-Visual Event Localization via Recursive Fusion by Joint Co-Attention. 2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WACV 2021, 2021: 4012-4021
- [5] Dual Attention Matching for Audio-Visual Event Localization. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019: 6301-6309
- [6] Learning Event-Specific Localization Preferences for Audio-Visual Event Localization. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023: 3446-3454
- [7] Look, Listen, and Attend: Co-Attention Network for Self-Supervised Audio-Visual Representation Learning. MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020: 3884-3892
- [8] Audio-visual event detection based on mining of semantic audio-visual labels. STORAGE AND RETRIEVAL METHODS AND APPLICATIONS FOR MULTIMEDIA 2004, 2004, 5307: 292-299
- [9] Temporal Cross-Modal Attention for Audio-Visual Event Localization. Seimitsu Kogaku Kaishi/Journal of the Japan Society for Precision Engineering, 2022, 88(03): 263-268