Censer: Curriculum Semi-supervised Learning for Speech Recognition Based on Self-supervised Pre-training

Cited by: 0
Authors
Zhang, Bowen [1 ]
Cao, Songjun [2 ]
Zhang, Xiaoming [2 ]
Zhang, Yike [2 ]
Ma, Long [2 ]
Shinozaki, Takahiro [1 ]
Affiliations
[1] Tokyo Inst Technol, Tokyo, Japan
[2] Tencent Cloud Xiaowei, Beijing, Peoples R China
Source
INTERSPEECH 2022, 2022
Keywords
semi-supervised learning; speech recognition; self-training; pseudo-labeling; curriculum learning
DOI
10.21437/Interspeech.2022-10226
Chinese Library Classification
O42 [Acoustics]
Subject Classification Codes
070206; 082403
Abstract
Recent studies have shown that the benefits provided by self-supervised pre-training and self-training (pseudo-labeling) are complementary. Semi-supervised fine-tuning strategies under the pre-training framework, however, remain insufficiently studied. Moreover, modern semi-supervised speech recognition algorithms either treat unlabeled data indiscriminately or filter out noisy samples with a confidence threshold, ignoring the dissimilarities among different unlabeled samples. In this paper, we propose Censer, a semi-supervised speech recognition algorithm based on self-supervised pre-training that maximizes the utilization of unlabeled data. The pre-training stage of Censer adopts wav2vec2.0, and the fine-tuning stage employs an improved semi-supervised learning algorithm derived from slimIPL, which incorporates unlabeled data progressively according to the quality of their pseudo labels. We also introduce a temporal pseudo-label pool and an exponential moving average to control the update frequency of the pseudo labels and to avoid model divergence. Experimental results on the Libri-Light and LibriSpeech datasets show that our proposed method achieves better performance than existing approaches while being more unified.
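
A minimal, illustrative Python sketch of the fine-tuning loop the abstract describes: a student model (initialized from wav2vec2.0 pre-training) alternates supervised steps with steps on pseudo labels drawn from a bounded temporal pool, the pool is filled by a slowly updated EMA teacher, and a curriculum schedule admits progressively lower-quality pseudo labels as training proceeds. All names here (TinyModel, censer_like_finetune) and the use of decoder confidence as the pseudo-label quality measure are assumptions for illustration, not the authors' implementation.

import random
from collections import deque

class TinyModel:
    """Stand-in for a wav2vec2.0-initialized ASR model (illustrative only)."""
    def __init__(self):
        self.weights = {"w": 0.0}

    def transcribe(self, utterance):
        # Placeholder decode: returns (hypothesis, confidence in [0, 1]).
        # A real system would beam-search and score the hypothesis.
        return "hyp:" + utterance, random.random()

    def train_step(self, utterance, label):
        # Placeholder gradient step on one (utterance, label) pair.
        self.weights["w"] += 0.01


def censer_like_finetune(labeled, unlabeled, steps=1000, pool_size=100,
                         ema_decay=0.999, start_ratio=0.3, end_ratio=1.0):
    student, teacher = TinyModel(), TinyModel()
    pool = deque(maxlen=pool_size)  # temporal pool of (utt, pseudo label, conf)

    for step in range(steps):
        # Supervised step on a labeled pair.
        utt, ref = random.choice(labeled)
        student.train_step(utt, ref)

        # The slowly moving EMA teacher produces pseudo labels, which enter a
        # bounded temporal pool; maxlen limits how often labels regenerate.
        u = random.choice(unlabeled)
        hyp, conf = teacher.transcribe(u)
        pool.append((u, hyp, conf))

        # Curriculum: admit an increasing fraction of the pool over training,
        # highest-confidence (presumed easiest) pseudo labels first.
        ratio = start_ratio + (end_ratio - start_ratio) * step / max(1, steps - 1)
        ranked = sorted(pool, key=lambda item: item[2], reverse=True)
        admitted = ranked[: max(1, int(ratio * len(ranked)))]

        # Unsupervised step on one admitted pseudo-labeled utterance.
        pseudo_utt, pseudo_label, _ = random.choice(admitted)
        student.train_step(pseudo_utt, pseudo_label)

        # EMA update keeps the teacher a smoothed copy of the student,
        # damping the feedback loop that can make pure pseudo-labeling diverge.
        for k, w in student.weights.items():
            teacher.weights[k] = ema_decay * teacher.weights[k] + (1 - ema_decay) * w

    return student


if __name__ == "__main__":
    labeled = [("utt%d" % i, "ref%d" % i) for i in range(10)]
    unlabeled = ["uutt%d" % i for i in range(100)]
    censer_like_finetune(labeled, unlabeled, steps=50)

In this reading, the temporal pool and the EMA teacher are the two stabilizers the abstract names: the pool caps how frequently pseudo labels are refreshed, while the EMA decouples the label generator from the rapidly changing student.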
Pages: 2653-2657
Page count: 5