57 entries in total
[1]
Emotion Recognition in Speech using Cross-Modal Transfer in the Wild [J]. Proceedings of the 2018 ACM Multimedia Conference (MM'18), 2018: 292-301
[2]
Look, Listen and Learn [J]. 2017 IEEE International Conference on Computer Vision (ICCV), 2017: 609-617
[3]
Aytar Y., 2016, Advances in neural information processing systems, V29, P892
[4]
Barzelay Zohar, 2007, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), P1, DOI 10.1109/CVPR.2007.383344
[5]
Brand M, 1999, COMP GRAPH, P21, DOI 10.1145/311535.311537
[6]
Bregler C., 1997, Video rewrite: driving visual speech with audio. Proceedings of the 24th annual conference on computer graphics and interactive techniques
[7]
Cao Zhe, 2018, arXiv
[8]
Monoaural Audio Source Separation Using Deep Convolutional Neural Networks [J]. Latent Variable Analysis and Signal Separation (LVA/ICA 2017), 2017, 10169: 258-266
[9]
Lip Reading Sentences in the Wild [J]. 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), 2017: 3444-3450