34 entries in total
- [1] Akbari H, 2021, ADV NEUR IN
- [3] Carion Nicolas, 2020, Computer Vision - ECCV 2020. 16th European Conference. Proceedings. Lecture Notes in Computer Science (LNCS 12346), P213, DOI 10.1007/978-3-030-58452-8_13
- [4] Video Emotion Recognition in the Wild Based on Fusion of Multimodal Features [J]. ICMI'16: PROCEEDINGS OF THE 18TH ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2016: 494-500
- [5] Look, Listen, and Attend: Co-Attention Network for Self-Supervised Audio-Visual Representation Learning [J]. MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020: 3884-3892
- [6] Dai WL, 2020, Arxiv, DOI arXiv:2009.09629
- [7] Dai WL, 2021, 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), P5305
- [8] Dosovitskiy A, 2021, Arxiv, DOI arXiv:2010.11929
- [9] Gong Y, 2022, AAAI CONF ARTIF INTE, P10699
- [10] Han W, 2021, 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), P9180