31 entries in total
- [21] Panayotov Vassil, 2015, ICASSP 2015
- [22] Pepino L., 2021, INTERSPEECH
- [23] Rivière M, 2020, INT CONF ACOUST SPEE, P7414, DOI 10.1109/ICASSP40776.2020.9054548
- [24] wav2vec: Unsupervised Pre-training for Speech Recognition [J]. INTERSPEECH 2019, 2019: 3465-3469
- [25] WISE: Word-Level Interaction-Based Multimodal Fusion for Speech Emotion Recognition [J]. INTERSPEECH 2020, 2020: 369-373
- [26] van den Oord Aaron, 2018, arXiv
- [27] Vlasenko Bogdan, 2007, COMBINING FRAME TURN, P1
- [28] Wang Jianyou, 2020, ICASSP
- [29] Yang Shu-wen, 2021, ARXIV210501051
- [30] Yoon Seunghyun, 2018, SLT