50 entries in total
[22] A Robust Audio-Visual Speech Enhancement Model [J]. 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2020: 7529-7533.
[24] BenAV: a Bengali Audio-Visual Corpus for Visual Speech Recognition [J]. Neural Information Processing, ICONIP 2021, Pt II, 2021, 13109: 526-535.
[25] Multi-Attention Audio-Visual Fusion Network for Audio Spatialization [J]. Proceedings of the 2021 International Conference on Multimedia Retrieval (ICMR '21), 2021: 394-401.
[26] Persian Music Source Separation in Audio-Visual Data Using Deep Learning [J]. 2020 6th Iranian Conference on Signal Processing and Intelligent Systems (ICSPIS), 2020.
[27] Improving speech embedding using crossmodal transfer learning with audio-visual data [J]. Multimedia Tools and Applications, 2019, 78: 15681-15704.
[30] Audio-Visual Embedding for Cross-Modal Music Video Retrieval through Supervised Deep CCA [J]. 2018 IEEE International Symposium on Multimedia (ISM 2018), 2018: 143-150.