Linear-Complexity Self-Supervised Learning for Speech Processing

Cited by: 0
Authors
Zhang, Shucong [1 ]
Parcollet, Titouan [1 ]
van Dalen, Rogier [1 ]
Bhattacharya, Sourav [1 ]
Affiliations
[1] Samsung AI Center, Cambridge, England
Source
INTERSPEECH 2024, 2024
Keywords
self-supervised learning; efficient models
DOI
10.21437/Interspeech.2024-500
Abstract
Self-supervised learning (SSL) models usually require weeks of pre-training with dozens of high-end GPUs. These models typically have a multi-headed self-attention (MHSA) context encoder. However, MHSA takes quadratic time and space in the input length, contributing to the high pre-training cost. Linear-complexity alternatives to MHSA have been proposed. For instance, in supervised training, the SummaryMixing model is the first to outperform MHSA across multiple speech processing tasks. However, these cheaper alternatives have not yet been explored for SSL. This paper studies a linear-complexity context encoder for SSL for the first time. With better or equivalent performance on the downstream tasks of the MP3S benchmark, SummaryMixing reduces the pre-training time and peak VRAM of a wav2vec 2.0 model by 18% and 23%, respectively, allowing the pre-training of a 155M-parameter wav2vec 2.0 model to finish within one week on 4 Tesla A100 GPUs. Code(1) is available.
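The abstract does not spell out the SummaryMixing operator it builds on. The following is a minimal PyTorch-style sketch of the published SummaryMixing formulation (a per-frame local transform combined with a mean-pooled global summary, giving O(T) cost); the class and parameter names are illustrative assumptions, not the authors' released code, and a practical implementation would also mask padded frames before pooling.

import torch
import torch.nn as nn

class SummaryMixing(nn.Module):
    """Linear-time token mixing: each frame is combined with a single
    mean-pooled summary of the whole utterance instead of attending to
    every other frame (O(T) rather than O(T^2) in the sequence length)."""

    def __init__(self, d_model, d_hidden=None):
        super().__init__()
        d_hidden = d_hidden or d_model
        # Local (per-frame) transformation f
        self.local = nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU())
        # Summary transformation g, applied before mean pooling
        self.summary = nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU())
        # Combiner c over the concatenation [f(x_t); s]
        self.combine = nn.Sequential(nn.Linear(2 * d_hidden, d_model), nn.GELU())

    def forward(self, x):
        # x: (batch, time, d_model)
        local = self.local(x)                          # f(x_t) for every frame
        s = self.summary(x).mean(dim=1, keepdim=True)  # single summary vector
        s = s.expand(-1, x.size(1), -1)                # broadcast to all frames
        return self.combine(torch.cat([local, s], dim=-1))

if __name__ == "__main__":
    # Example: two utterances of 100 frames with 512-dimensional features
    mixer = SummaryMixing(d_model=512)
    feats = torch.randn(2, 100, 512)
    print(mixer(feats).shape)  # torch.Size([2, 100, 512])

Because the only cross-frame operation is a mean over time, compute and memory grow linearly with the input length, which is the property the paper exploits to cut wav2vec 2.0 pre-training time and peak VRAM.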
Pages: 3480-3484 (5 pages)