215 entries in total
- [81] He Z, 2021, AAAI C ARTIFICIAL IN, P5931
- [82] Herzig R, 2018, ADV NEURAL INFORM PR, P7211
- [83] Hessel M, 2019, AAAI CONF ARTIF INTE, P3796
- [84] Hill F, 2021, ICLR 2021
- [85] Deep Multimodal Clustering for Unsupervised Audiovisual Learning [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 9240-9249
- [86] Hu H., 2020, ICLR 2020
- [87] Learning to Reason: End-to-End Module Networks for Visual Question Answering [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017: 804-813
- [88] Le H, 2019, 57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), P5612
- [90] Direct speech-to-speech translation with a sequence-to-sequence model [J]. INTERSPEECH 2019, 2019: 1123-1127