74 entries in total
[11]
Dosovitskiy A, 2021, Arxiv, DOI [arXiv:2010.11929, 10.48550/arXiv.2010.11929]
[13]
Metric Learning Based Feature Representation with Gated Fusion Model for Speech Emotion Recognition [J]. INTERSPEECH 2021, 2021: 4503-4507
[15]
Gómez-Zaragozá L, 2024, Arxiv, DOI arXiv:2403.02167
[16]
Guizzo E, 2020, INT CONF ACOUST SPEE, P6489, DOI 10.1109/ICASSP40776.2020.9053727
[17]
CMT: Convolutional Neural Networks Meet Vision Transformers [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022: 12165-12175
[18]
Representation Learning with Spectro-Temporal-Channel Attention for Speech Emotion Recognition [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021: 6304-6308
[19]
Gupta B., 2019, Emerg. Sci. J., V3, P23
[20]
Deep Residual Learning for Image Recognition [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 770-778