32 entries in total
- [1] My lips are concealed: Audio-visual speech enhancement through obstructions [J]. INTERSPEECH 2019, 2019: 4295-4299
- [2] Afouras T, 2018, INTERSPEECH, P3244
- [5] Lip Reading Sentences in the Wild [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 3444-3450
- [6] Delcroix M, 2018, 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), P5554, DOI 10.1109/ICASSP.2018.8462661
- [7] Looking to Listen at the Cocktail Party: A Speaker-Independent Audio-Visual Model for Speech Separation [J]. ACM TRANSACTIONS ON GRAPHICS, 2018, 37 (04)
- [8] VISUALVOICE: Audio-Visual Speech Separation with Cross-Modal Consistency [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 15490-15500
- [9] Hershey JR, 2016, INT CONF ACOUST SPEE, P31, DOI 10.1109/ICASSP.2016.7471631
- [10] Single-Channel Multi-Speaker Separation using Deep Clustering [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016: 545-549