35 entries in total
[31] Unsupervised Feature Learning via Non-Parametric Instance Discrimination [J]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018: 3733-3742.
[32] DyViSE: Dynamic Vision-Guided Speaker Embedding for Audio-Visual Speaker Diarization [J]. 2022 IEEE 24th International Workshop on Multimedia Signal Processing (MMSP), 2022.
[33] SUPERB: Speech processing Universal PERformance Benchmark [J]. Interspeech 2021, 2021: 1194-1198.
[34] Yang Y., 2022, arXiv:2211.00325.
[35] E2E-based Multi-task Learning Approach to Joint Speech and Accent Recognition [J]. Interspeech 2021, 2021: 1519-1523.