Speech Self-Supervised Representation Benchmarking: Are We Doing it Right?

Cited by: 1

Authors:

Zaiem, Salah [1]

Kemiche, Youcef [2,3]

Parcollet, Titouan [4,5]

Essid, Slim [1]

Ravanelli, Mirco [6]

Affiliations:

[1] Inst Polytech Paris, Telecom Paris, LTCI, Paris, France

[2] Hi PARIS Engn Team, Paris, France

[3] Capgemini, Paris, France

[4] Samsung AI Ctr, Cambridge, England

[5] Univ Cambridge, Cambridge, England

[6] Concordia Univ, Univ Montreal, Mila Quebec AI Inst, Montreal, PQ, Canada

Source:

INTERSPEECH 2023 | 2023

Keywords:

self-supervised learning; representation learning

DOI:

10.21437/Interspeech.2023-1087

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Subject classification codes:

070206; 082403

Abstract:

Self-supervised learning (SSL) has recently allowed leveraging large datasets of unlabeled speech signals to reach impressive performance on speech tasks using only small amounts of annotated data. The high number of proposed approaches fostered the need and rise of extended benchmarks that evaluate their performance on a set of downstream tasks exploring various aspects of the speech signal. However, and while the number of considered tasks has been growing, most rely upon a single decoding architecture that maps the frozen SSL representations to the downstream labels. This work investigates the robustness of such benchmarking results to changes in the decoder architecture. Interestingly, it appears that varying the architecture of the downstream decoder leads to significant variations in the leaderboards of most tasks. Concerningly, our study reveals that benchmarking using limited decoders may cause a counterproductive increase in the sizes of the developed SSL models.

Pages: 2873-2877

Number of pages: 5
