SKILL: SIMILARITY-AWARE KNOWLEDGE DISTILLATION FOR SPEECH SELF-SUPERVISED LEARNING

Cited by: 0
Authors
Zampierin, Luca [1 ,2 ]
Hacene, Ghouthi Boukli [1 ,5 ]
Nguyen, Bac [1 ]
Ravanelli, Mirco [3 ,4 ,5 ]
Affiliations
[1] Sony Europe BV, Stuttgart Lab 1, Stuttgart, Germany
[2] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
[3] Concordia Univ, Montreal, PQ, Canada
[4] Univ Montreal, Montreal, PQ, Canada
[5] Mila Quebec AI Inst, Montreal, PQ, Canada
Source
2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING WORKSHOPS, ICASSPW 2024 | 2024
Keywords
Model compression; self-supervised learning; knowledge distillation
DOI
10.1109/ICASSPW62465.2024.10626978
Chinese Library Classification
O42 [Acoustics]
Discipline codes
070206; 082403
Abstract
Self-supervised learning (SSL) has achieved remarkable success across various speech-processing tasks. To enhance its efficiency, previous works often leverage the use of compression techniques. A notable recent attempt is DPHuBERT, which applies joint knowledge distillation (KD) and structured pruning to learn a significantly smaller SSL model. In this paper, we contribute to this research domain by introducing SKILL, a novel method that conducts distillation across groups of layers instead of distilling individual arbitrarily selected layers within the teacher network. The identification of the layers to distill is achieved through a hierarchical clustering procedure applied to layer similarity measures. Extensive experiments demonstrate that our distilled version of WavLM Base+ not only outperforms DPHuBERT but also achieves state-of-the-art results in the 30M parameters model class across several SUPERB tasks.
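The layer-grouping step the abstract describes can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: it assumes linear CKA as the layer similarity measure and average-linkage hierarchical clustering, both plausible but hypothetical choices; the function names (`layer_similarity`, `group_layers`) are invented for this sketch.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def layer_similarity(reps):
    """Pairwise linear CKA between teacher-layer representations.

    reps: list of (frames, dim) arrays, one per teacher layer.
    Returns an (n_layers, n_layers) similarity matrix in [0, 1].
    """
    n = len(reps)
    sim = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            x = reps[i] - reps[i].mean(axis=0)   # center features
            y = reps[j] - reps[j].mean(axis=0)
            num = np.linalg.norm(x.T @ y, "fro") ** 2
            den = np.linalg.norm(x.T @ x, "fro") * np.linalg.norm(y.T @ y, "fro")
            sim[i, j] = num / den
    return sim

def group_layers(sim, n_groups):
    """Hierarchically cluster layers on dissimilarity 1 - CKA.

    Returns an integer cluster label per layer; layers sharing a label
    would be distilled as a group rather than individually.
    """
    dist = 1.0 - sim
    iu = np.triu_indices_from(dist, k=1)          # condensed form for scipy
    z = linkage(dist[iu], method="average")
    return fcluster(z, t=n_groups, criterion="maxclust")
```

Given representations from, say, 12 teacher layers, `group_layers(layer_similarity(reps), 4)` would partition the layers into 4 similarity-based groups from which distillation targets can be drawn.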
Pages: 675-679
Page count: 5