24 entries in total
[1]
Burkhardt F, 2005, INTERSPEECH, P1517, DOI 10.21437/INTERSPEECH.2005-446
[3]
Speaker Normalization for Self-Supervised Speech Emotion Recognition [J]. 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022: 7342-7346
[4]
Gillioz Anthony, 2020, 2020 15th Conference on Computer Science and Information Systems (FedCSIS), P179, DOI 10.15439/2020F20
[5]
Video Action Transformer Network [J]. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019), 2019: 244-253
[6]
Conformer: Convolution-augmented Transformer for Speech Recognition [J]. INTERSPEECH 2020, 2020: 5036-5040
[7]
Hinton G. E., 2013, P ICML WORKSH DEEP L
[8]
Jackson Philip, 2011, Surrey Audio-Visual Expressed Emotion (SAVEE) database
[9]
An Attention Pooling based Representation Learning Method for Speech Emotion Recognition [J]. 19th Annual Conference of the International Speech Communication Association (INTERSPEECH 2018), Vols 1-6: Speech Research for Emerging Markets in Multilingual Societies, 2018: 3087-3091