COMPARISON OF SELF-SUPERVISED SPEECH PRE-TRAINING METHODS ON FLEMISH DUTCH

Cited by: 1
Authors
Poncelet, Jakob [1]
Van Hamme, Hugo [1]
Affiliations
[1] KU Leuven, Department of Electrical Engineering (ESAT-PSI), Kasteelpark Arenberg 10, Bus 2441, B-3001 Leuven, Belgium
Source
2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU) | 2021
Keywords
speech recognition; self-supervised learning; pre-training; cross-lingual
DOI
10.1109/ASRU51503.2021.9688061
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Recent research in speech processing exhibits a growing interest in unsupervised and self-supervised representation learning from unlabelled data to alleviate the need for large amounts of annotated data. We investigate several popular pre-training methods and apply them to Flemish Dutch. We compare off-the-shelf English pre-trained models to models trained on an increasing amount of Flemish data. We find that the most important factors for positive transfer to downstream speech recognition tasks include a substantial amount of data and a matching pre-training domain. Ideally, we also finetune on an annotated subset in the target language. All pre-trained models improve linear phone separability in Flemish, but not all methods improve Automatic Speech Recognition. We observe superior performance with wav2vec 2.0 and obtain a 30% WER improvement by finetuning the multilingually pre-trained XLSR-53 model on Flemish Dutch, after integration into an HMM-DNN acoustic model.
Pages: 169-176
Number of pages: 8
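
As an illustration of the fine-tuning step mentioned in the abstract, the following is a minimal sketch in Python, using the Hugging Face transformers library, of adapting the multilingually pre-trained XLSR-53 wav2vec 2.0 checkpoint to Flemish Dutch with a CTC head. The vocabulary size, dummy data and training details are assumptions for illustration and are not the authors' exact recipe; the subsequent integration into an HMM-DNN acoustic model is not shown.

# Minimal, illustrative sketch (not the authors' exact setup): fine-tune the
# multilingually pre-trained XLSR-53 wav2vec 2.0 encoder with a CTC head,
# as one would for an annotated Flemish Dutch subset.
import torch
from transformers import Wav2Vec2ForCTC

# Load the XLSR-53 checkpoint and attach a randomly initialised CTC output
# layer sized to an assumed Flemish Dutch grapheme inventory of 40 symbols.
model = Wav2Vec2ForCTC.from_pretrained(
    "facebook/wav2vec2-large-xlsr-53",
    vocab_size=40,               # assumed target vocabulary size
    ctc_loss_reduction="mean",
)
model.freeze_feature_encoder()   # common practice: keep the CNN feature extractor fixed

# One illustrative training step on dummy data: two 1-second waveforms at
# 16 kHz with random label sequences (indices > 0, since index 0 is the
# CTC blank/padding token). Real training would use Flemish speech and text.
waveforms = torch.randn(2, 16000)
labels = torch.randint(1, 40, (2, 12))
outputs = model(input_values=waveforms, labels=labels)
outputs.loss.backward()          # followed by an optimiser step in practice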
相关论文
共 50 条
  • [31] Incorporation of Iterative Self-supervised Pre-training in the Creation of the ASR System for the Tatar Language
    Khusainov, Aidar
    Suleymanov, Dzhavdet
    Muhametzyanov, Ilnur
    TEXT, SPEECH, AND DIALOGUE, TSD 2021, 2021, 12848 : 481 - 488
  • [32] SPot-the-Difference Self-supervised Pre-training for Anomaly Detection and Segmentation
    Zou, Yang
    Jeong, Jongheon
    Pemula, Latha
    Zhang, Dongqing
    Dabeer, Onkar
    COMPUTER VISION - ECCV 2022, PT XXX, 2022, 13690 : 392 - 408
  • [33] Masked Deformation Modeling for Volumetric Brain MRI Self-Supervised Pre-Training
    Lyu, Junyan
    Bartlett, Perry F.
    Nasrallah, Fatima A.
    Tang, Xiaoying
    IEEE TRANSACTIONS ON MEDICAL IMAGING, 2025, 44 (03) : 1596 - 1607
  • [34] Self-supervised depth super-resolution with contrastive multiview pre-training
    Qiao, Xin
    Ge, Chenyang
    Zhao, Chaoqiang
    Tosi, Fabio
    Poggi, Matteo
    Mattoccia, Stefano
    NEURAL NETWORKS, 2023, 168 : 223 - 236
  • [35] Reducing Barriers to Self-Supervised Learning: HuBERT Pre-training with Academic Compute
    Chen, William
    Chang, Xuankai
    Peng, Yifan
    Ni, Zhaoheng
    Maiti, Soumi
    Watanabe, Shinji
    INTERSPEECH 2023, 2023, : 4404 - 4408
  • [36] A SELF-SUPERVISED PRE-TRAINING FRAMEWORK FOR VISION-BASED SEIZURE CLASSIFICATION
    Hou, Jen-Cheng
    McGonigal, Aileen
    Bartolomei, Fabrice
    Thonnat, Monique
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 1151 - 1155
  • [37] Mutual information-driven self-supervised point cloud pre-training
    Xu, Weichen
    Fu, Tianhao
    Cao, Jian
    Zhao, Xinyu
    Xu, Xinxin
    Cao, Xixin
    Zhang, Xing
    KNOWLEDGE-BASED SYSTEMS, 2025, 307
  • [38] Self-supervised pre-training improves fundus image classification for diabetic retinopathy
    Lee, Joohyung
    Lee, Eung-Joo
    REAL-TIME IMAGE PROCESSING AND DEEP LEARNING 2022, 2022, 12102
  • [39] A debiased self-training framework with graph self-supervised pre-training aided for semi-supervised rumor detection
    Qiao, Yuhan
    Cui, Chaoqun
    Wang, Yiying
    Jia, Caiyan
    NEUROCOMPUTING, 2024, 604
  • [40] SELF-TRAINING AND PRE-TRAINING ARE COMPLEMENTARY FOR SPEECH RECOGNITION
    Xu, Qiantong
    Baevski, Alexei
    Likhomanenko, Tatiana
    Tomasello, Paden
    Conneau, Alexis
    Collobert, Ronan
    Synnaeve, Gabriel
    Auli, Michael
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 3030 - 3034