50 entries in total
- [21] Multi-modal temporal asynchronicity modeling by product HMMs for robust audio-visual speech recognition FOURTH IEEE INTERNATIONAL CONFERENCE ON MULTIMODAL INTERFACES, PROCEEDINGS, 2002: 305 - 309
- [22] Single-modal Incremental Terrain Clustering from Self-Supervised Audio-Visual Feature Learning 2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021: 9399 - 9406
- [23] Self-Supervised Audio-Visual Feature Learning for Single-Modal Incremental Terrain Type Clustering IEEE ACCESS, 2021, 9: 64346 - 64357
- [24] INVESTIGATING SELF-SUPERVISED LEARNING FOR SPEECH ENHANCEMENT AND SEPARATION 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022: 6837 - 6841
- [25] TOWARDS POSE-INVARIANT AUDIO-VISUAL SPEECH ENHANCEMENT IN THE WILD FOR NEXT-GENERATION MULTI-MODAL HEARING AIDS 2023 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING WORKSHOPS, ICASSPW, 2023
- [26] MTCAM: A Novel Weakly-Supervised Audio-Visual Saliency Prediction Model With Multi-Modal Transformer IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2024, 8 (02): 1756 - 1771
- [27] VISUALVOICE: Audio-Visual Speech Separation with Cross-Modal Consistency 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 15490 - 15500
- [30] Self-Supervised Audio-Visual Representation Learning for in-the-wild Videos 2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2020: 5671 - 5672