31 entries in total
[1]
Deep Lip Reading: a comparison of models and an online application [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018: 3514-3518
[2]
Alcazar J. L., 2020, CVPR
[3]
[Anonymous], 2017, Neurocomputing
[4]
Look, Listen and Learn [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017: 609-617
[5]
Aytar Y, 2016, ADV NEUR IN, V29
[6]
Berghi D, 2020, 2020 IEEE CONFERENCE ON VIRTUAL REALITY AND 3D USER INTERFACES WORKSHOPS (VRW 2020), P667, DOI [10.1109/VRW50115.2020.00-91, 10.1109/VRW50115.2020.00184]
[7]
Chakravarty P., 2015, ACM INT C MULTIMODAL
[8]
Chung Joon Son, 2019, Naver at ActivityNet challenge 2019-Task B active speaker detection (AVA)
[9]
Cutler R, 2000, 2000 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, PROCEEDINGS VOLS I-III, P1589, DOI 10.1109/ICME.2000.871073
[10]
Looking to Listen at the Cocktail Party: A Speaker-Independent Audio-Visual Model for Speech Separation [J]. ACM TRANSACTIONS ON GRAPHICS, 2018, 37(04)