Speech Self-Supervised Representation Benchmarking: Are We Doing it Right?

Cited by: 1
Authors
Zaiem, Salah [1]
Kemiche, Youcef [2, 3]
Parcollet, Titouan [4, 5]
Essid, Slim [1]
Ravanelli, Mirco [6]
Affiliations
[1] Inst Polytech Paris, Telecom Paris, LTCI, Paris, France
[2] Hi PARIS Engn Team, Paris, France
[3] Capgemini, Paris, France
[4] Samsung AI Ctr, Cambridge, England
[5] Univ Cambridge, Cambridge, England
[6] Concordia Univ, Univ Montreal, Mila Quebec AI Inst, Montreal, PQ, Canada
Source
INTERSPEECH 2023 | 2023
Keywords
self-supervised learning; representation learning;
DOI
10.21437/Interspeech.2023-1087
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Self-supervised learning (SSL) has recently allowed leveraging large datasets of unlabeled speech signals to reach impressive performance on speech tasks using only small amounts of annotated data. The high number of proposed approaches has fostered the need for, and the rise of, extended benchmarks that evaluate their performance on a set of downstream tasks exploring various aspects of the speech signal. However, while the number of considered tasks has been growing, most benchmarks rely upon a single decoding architecture that maps the frozen SSL representations to the downstream labels. This work investigates the robustness of such benchmarking results to changes in the decoder architecture. Interestingly, it appears that varying the architecture of the downstream decoder leads to significant variations in the leaderboards of most tasks. Concerningly, our study reveals that benchmarking using limited decoders may cause a counterproductive increase in the sizes of the developed SSL models.
Pages: 2873-2877
Number of pages: 5
Related papers
50 in total
  • [31] MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets
    Ma, Ziyang
    Zheng, Zhisheng
    Tang, Changli
    Wang, Yujin
    Chen, Xie
    INTERSPEECH 2023, 2023, : 82 - 86
  • [32] Automatic Data Augmentation Selection and Parametrization in Contrastive Self-Supervised Speech Representation Learning
    Zaiem, Salah
    Parcollet, Titouan
    Essid, Slim
    INTERSPEECH 2022, 2022, : 669 - 673
  • [33] LARGE-SCALE SELF-SUPERVISED SPEECH REPRESENTATION LEARNING FOR AUTOMATIC SPEAKER VERIFICATION
    Chen, Zhengyang
    Chen, Sanyuan
    Wu, Yu
    Qian, Yao
    Wang, Chengyi
    Liu, Shujie
    Qian, Yanmin
    Zeng, Michael
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 6147 - 6151
  • [34] SUPERB @ SLT 2022: CHALLENGE ON GENERALIZATION AND EFFICIENCY OF SELF-SUPERVISED SPEECH REPRESENTATION LEARNING
    Feng, Tzu-Hsun
    Dong, Annie
    Yeh, Ching-Feng
    Yang, Shu-Wen
    Lin, Tzu-Quan
    Shi, Jiatong
    Chang, Kai-Wei
    Huang, Zili
    Wu, Haibin
    Chang, Xuankai
    Watanabe, Shinji
    Mohamed, Abdelrahman
    Li, Shang-Wen
    Lee, Hung-Yi
    2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 1096 - 1103
  • [35] Spectral Salt-and-Pepper Patch Masking for Self-Supervised Speech Representation Learning
    Kim, June-Woo
    Chung, Hoon
    Jung, Ho-Young
    MATHEMATICS, 2023, 11 (15)
  • [36] UNIVERSAL PARALINGUISTIC SPEECH REPRESENTATIONS USING SELF-SUPERVISED CONFORMERS
    Shor, Joel
    Jansen, Aren
    Han, Wei
    Park, Daniel
    Zhang, Yu
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 3169 - 3173
  • [37] Robust Self-Supervised Audio-Visual Speech Recognition
    Shi, Bowen
    Hsu, Wei-Ning
    Mohamed, Abdelrahman
    INTERSPEECH 2022, 2022, : 2118 - 2122
  • [38] INJECTING TEXT IN SELF-SUPERVISED SPEECH PRETRAINING
    Chen, Zhehuai
    Zhang, Yu
    Rosenberg, Andrew
    Ramabhadran, Bhuvana
    Wang, Gary
    Moreno, Pedro
    2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 251 - 258
  • [39] Self-supervised audiovisual representation learning for remote sensing data
    Heidler, Konrad
    Mou, Lichao
    Hu, Di
    Jin, Pu
    Li, Guangyao
    Gan, Chuang
    Wen, Ji-Rong
    Zhu, Xiao Xiang
    INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION, 2023, 116
  • [40] Contrastive Self-supervised Representation Learning Using Synthetic Data
    She, Dong-Yu
    Xu, Kun
    INTERNATIONAL JOURNAL OF AUTOMATION AND COMPUTING, 2021, 18 (04) : 556 - 567