31 entries in total
[21]
Panayotov Vassil, 2015, ICASSP 2015
[22]
Pepino L., 2021, INTERSPEECH
[23]
Rivière M, 2020, INT CONF ACOUST SPEE, P7414, DOI 10.1109/ICASSP40776.2020.9054548
[24]
wav2vec: Unsupervised Pre-training for Speech Recognition [J]. INTERSPEECH 2019, 2019: 3465-3469
[25]
WISE: Word-Level Interaction-Based Multimodal Fusion for Speech Emotion Recognition [J]. INTERSPEECH 2020, 2020: 369-373
[26]
van den Oord Aaron, 2018, CoRR, DOI 10.48550/arxiv.1807.03748
[27]
Vlasenko Bogdan, 2007, COMBINING FRAME TURN, P1
[28]
Wang Jianyou, 2020, ICASSP
[29]
Yang Shu-wen, 2021, ARXIV210501051
[30]
Yoon Seunghyun, 2018, SLT