Linear-Complexity Self-Supervised Learning for Speech Processing

Cited by: 0
Authors
Zhang, Shucong [1 ]
Parcollet, Titouan [1 ]
van Dalen, Rogier [1 ]
Bhattacharya, Sourav [1 ]
Affiliations
[1] Samsung AI Center, Cambridge, England
Source
INTERSPEECH 2024, 2024
Keywords
self-supervised learning; efficient models
DOI
10.21437/Interspeech.2024-500
Abstract
Self-supervised learning (SSL) models usually require weeks of pre-training with dozens of high-end GPUs. These models typically have a multi-headed self-attention (MHSA) context encoder. However, MHSA takes quadratic time and space in the input length, contributing to the high pre-training cost. Linear-complexity alternatives to MHSA have been proposed. For instance, in supervised training, the SummaryMixing model is the first to outperform MHSA across multiple speech processing tasks. However, these cheaper alternatives have not yet been explored for SSL. This paper studies a linear-complexity context encoder for SSL for the first time. With better or equivalent performance on the downstream tasks of the MP3S benchmark, SummaryMixing reduces the pre-training time and peak VRAM of a wav2vec 2.0 model by 18% and 23%, respectively, allowing the pre-training of a 155M-parameter wav2vec 2.0 model to finish within one week on 4 Tesla A100 GPUs. Code(1) is available.
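The abstract does not spell out the SummaryMixing operator it builds on. The following is a minimal PyTorch-style sketch of the published SummaryMixing formulation (a per-frame local transform combined with a mean-pooled global summary, giving O(T) cost); the class and parameter names are illustrative assumptions, not the authors' released code, and a practical implementation would also mask padded frames before pooling.

import torch
import torch.nn as nn

class SummaryMixing(nn.Module):
    """Linear-time token mixing: each frame is combined with a single
    mean-pooled summary of the whole utterance instead of attending to
    every other frame (O(T) rather than O(T^2) in the sequence length)."""

    def __init__(self, d_model, d_hidden=None):
        super().__init__()
        d_hidden = d_hidden or d_model
        # Local (per-frame) transformation f
        self.local = nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU())
        # Summary transformation g, applied before mean pooling
        self.summary = nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU())
        # Combiner c over the concatenation [f(x_t); s]
        self.combine = nn.Sequential(nn.Linear(2 * d_hidden, d_model), nn.GELU())

    def forward(self, x):
        # x: (batch, time, d_model)
        local = self.local(x)                          # f(x_t) for every frame
        s = self.summary(x).mean(dim=1, keepdim=True)  # single summary vector
        s = s.expand(-1, x.size(1), -1)                # broadcast to all frames
        return self.combine(torch.cat([local, s], dim=-1))

if __name__ == "__main__":
    # Example: two utterances of 100 frames with 512-dimensional features
    mixer = SummaryMixing(d_model=512)
    feats = torch.randn(2, 100, 512)
    print(mixer(feats).shape)  # torch.Size([2, 100, 512])

Because the only cross-frame operation is a mean over time, compute and memory grow linearly with the input length, which is the property the paper exploits to cut wav2vec 2.0 pre-training time and peak VRAM.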
Pages: 3480-3484 (5 pages)