COMPARISON OF SELF-SUPERVISED SPEECH PRE-TRAINING METHODS ON FLEMISH DUTCH

Cited by: 1
Authors
Poncelet, Jakob [1]
Van Hamme, Hugo [1]
Affiliations
[1] KU Leuven, Department of Electrical Engineering (ESAT-PSI), Kasteelpark Arenberg 10, Bus 2441, B-3001 Leuven, Belgium
Source
2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU) | 2021
Keywords
speech recognition; self-supervised learning; pre-training; cross-lingual
DOI
10.1109/ASRU51503.2021.9688061
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Recent research in speech processing exhibits a growing interest in unsupervised and self-supervised representation learning from unlabelled data to alleviate the need for large amounts of annotated data. We investigate several popular pre-training methods and apply them to Flemish Dutch. We compare off-the-shelf English pre-trained models to models trained on an increasing amount of Flemish data. We find that the most important factors for positive transfer to downstream speech recognition tasks include a substantial amount of data and a matching pre-training domain. Ideally, we also finetune on an annotated subset in the target language. All pre-trained models improve linear phone separability in Flemish, but not all methods improve Automatic Speech Recognition. We observe superior performance with wav2vec 2.0 and obtain a 30% WER improvement by finetuning the multilingually pre-trained XLSR-53 model on Flemish Dutch, after integration into an HMM-DNN acoustic model.
Pages: 169-176
Number of pages: 8
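
As an illustration of the fine-tuning step mentioned in the abstract, the following is a minimal sketch in Python, using the Hugging Face transformers library, of adapting the multilingually pre-trained XLSR-53 wav2vec 2.0 checkpoint to Flemish Dutch with a CTC head. The vocabulary size, dummy data and training details are assumptions for illustration and are not the authors' exact recipe; the subsequent integration into an HMM-DNN acoustic model is not shown.

# Minimal, illustrative sketch (not the authors' exact setup): fine-tune the
# multilingually pre-trained XLSR-53 wav2vec 2.0 encoder with a CTC head,
# as one would for an annotated Flemish Dutch subset.
import torch
from transformers import Wav2Vec2ForCTC

# Load the XLSR-53 checkpoint and attach a randomly initialised CTC output
# layer sized to an assumed Flemish Dutch grapheme inventory of 40 symbols.
model = Wav2Vec2ForCTC.from_pretrained(
    "facebook/wav2vec2-large-xlsr-53",
    vocab_size=40,               # assumed target vocabulary size
    ctc_loss_reduction="mean",
)
model.freeze_feature_encoder()   # common practice: keep the CNN feature extractor fixed

# One illustrative training step on dummy data: two 1-second waveforms at
# 16 kHz with random label sequences (indices > 0, since index 0 is the
# CTC blank/padding token). Real training would use Flemish speech and text.
waveforms = torch.randn(2, 16000)
labels = torch.randint(1, 40, (2, 12))
outputs = model(input_values=waveforms, labels=labels)
outputs.loss.backward()          # followed by an optimiser step in practice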
相关论文
共 50 条
  • [31] Incorporation of Iterative Self-supervised Pre-training in the Creation of the ASR System for the Tatar Language
    Khusainov, Aidar
    Suleymanov, Dzhavdet
    Muhametzyanov, Ilnur
    TEXT, SPEECH, AND DIALOGUE, TSD 2021, 2021, 12848 : 481 - 488
  • [32] SPot-the-Difference Self-supervised Pre-training for Anomaly Detection and Segmentation
    Zou, Yang
    Jeong, Jongheon
    Pemula, Latha
    Zhang, Dongqing
    Dabeer, Onkar
    COMPUTER VISION - ECCV 2022, PT XXX, 2022, 13690 : 392 - 408
  • [33] Masked Deformation Modeling for Volumetric Brain MRI Self-Supervised Pre-Training
    Lyu, Junyan
    Bartlett, Perry F.
    Nasrallah, Fatima A.
    Tang, Xiaoying
    IEEE TRANSACTIONS ON MEDICAL IMAGING, 2025, 44 (03) : 1596 - 1607
  • [34] Self-supervised depth super-resolution with contrastive multiview pre-training
    Qiao, Xin
    Ge, Chenyang
    Zhao, Chaoqiang
    Tosi, Fabio
    Poggi, Matteo
    Mattoccia, Stefano
    NEURAL NETWORKS, 2023, 168 : 223 - 236
  • [35] Reducing Barriers to Self-Supervised Learning: HuBERT Pre-training with Academic Compute
    Chen, William
    Chang, Xuankai
    Peng, Yifan
    Ni, Zhaoheng
    Maiti, Soumi
    Watanabe, Shinji
    INTERSPEECH 2023, 2023, : 4404 - 4408
  • [36] A SELF-SUPERVISED PRE-TRAINING FRAMEWORK FOR VISION-BASED SEIZURE CLASSIFICATION
    Hou, Jen-Cheng
    McGonigal, Aileen
    Bartolomei, Fabrice
    Thonnat, Monique
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 1151 - 1155
  • [37] Mutual information-driven self-supervised point cloud pre-training
    Xu, Weichen
    Fu, Tianhao
    Cao, Jian
    Zhao, Xinyu
    Xu, Xinxin
    Cao, Xixin
    Zhang, Xing
    KNOWLEDGE-BASED SYSTEMS, 2025, 307
  • [38] Self-supervised pre-training improves fundus image classification for diabetic retinopathy
    Lee, Joohyung
    Lee, Eung-Joo
    REAL-TIME IMAGE PROCESSING AND DEEP LEARNING 2022, 2022, 12102
  • [39] A debiased self-training framework with graph self-supervised pre-training aided for semi-supervised rumor detection
    Qiao, Yuhan
    Cui, Chaoqun
    Wang, Yiying
    Jia, Caiyan
    NEUROCOMPUTING, 2024, 604
  • [40] SELF-TRAINING AND PRE-TRAINING ARE COMPLEMENTARY FOR SPEECH RECOGNITION
    Xu, Qiantong
    Baevski, Alexei
    Likhomanenko, Tatiana
    Tomasello, Paden
    Conneau, Alexis
    Collobert, Ronan
    Synnaeve, Gabriel
    Auli, Michael
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 3030 - 3034