COMPARISON OF SELF-SUPERVISED SPEECH PRE-TRAINING METHODS ON FLEMISH DUTCH

Cited by: 1
Authors
Poncelet, Jakob [1]
Van hamme, Hugo [1]
Affiliations
[1] Katholieke Univ Leuven, Dept Elect Engn ESAT PSI, Kasteelpark Arenberg 10, Bus 2441, B-3001 Leuven, Belgium
Source
2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU) | 2021
Keywords
speech recognition; self-supervised learning; pre-training; cross-lingual;
DOI
10.1109/ASRU51503.2021.9688061
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Recent research in speech processing exhibits a growing interest in unsupervised and self-supervised representation learning from unlabelled data to alleviate the need for large amounts of annotated data. We investigate several popular pre-training methods and apply them to Flemish Dutch. We compare off-the-shelf English pre-trained models to models trained on increasing amounts of Flemish data. We find that the most important factors for positive transfer to downstream speech recognition tasks are a substantial amount of data and a matching pre-training domain; ideally, the model is also finetuned on an annotated subset in the target language. All pre-trained models improve linear phone separability in Flemish, but not all methods improve Automatic Speech Recognition. Wav2vec 2.0 gives the best performance, and we obtain a 30% WER improvement by finetuning the multilingually pre-trained XLSR-53 model on Flemish Dutch and integrating it into an HMM-DNN acoustic model.
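As a rough illustration of the setup described in the abstract (not the authors' code), the sketch below loads the public multilingual XLSR-53 wav2vec 2.0 checkpoint with the HuggingFace Transformers library and extracts frame-level representations. In the paper, such representations (after finetuning on Flemish Dutch) feed a linear phone probe and an HMM-DNN acoustic model; the checkpoint name, file path, and library choice here are illustrative assumptions.

    import torch
    import torchaudio
    from transformers import Wav2Vec2Model

    # Public multilingual checkpoint; the paper additionally finetunes the model
    # on Flemish Dutch before use, which is omitted in this sketch.
    model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-large-xlsr-53").eval()

    # Load a waveform and resample to the 16 kHz input rate
    # ("utterance.wav" is an illustrative path, not from the paper).
    waveform, sr = torchaudio.load("utterance.wav")
    waveform = torchaudio.functional.resample(waveform, sr, 16000)

    # wav2vec 2.0 expects zero-mean, unit-variance raw audio of shape (batch, samples).
    audio = waveform.mean(dim=0, keepdim=True)            # mix down to mono
    audio = (audio - audio.mean()) / (audio.std() + 1e-7)  # per-utterance normalization

    with torch.no_grad():
        hidden = model(audio).last_hidden_state  # (1, n_frames, 1024), ~20 ms frame hop

    # "hidden" can serve as fixed features for a linear phone classifier, or
    # (after finetuning) as the front-end of an HMM-DNN acoustic model.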
Pages: 169 - 176 (8 pages)