Noisy Student Teacher Training with Self Supervised Learning for Children ASR

Cited by: 1
Authors
Chaturvedi, Shreya S. [1 ]
Sailor, Hardik B. [2 ,3 ]
Patil, Hemant A. [1 ]
Affiliations
[1] DA IICT, Speech Res Lab, Gandhinagar, India
[2] A*STAR, Inst Infocomm Res (I2R), Singapore, Singapore
[3] Samsung R&D Inst, Bangalore, Karnataka, India
Source
2022 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS, SPCOM | 2022
DOI
10.1109/SPCOM55316.2022.9840763
CLC (Chinese Library Classification)
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology];
Subject classification codes
0808 ; 0809 ;
Abstract
Automatic Speech Recognition (ASR) is a fast-growing field in which reliable systems are built for high-resource languages and for adult speech. However, the performance of such ASR systems degrades on children's speech, due to its numerous acoustic variabilities and the scarcity of resources. In this paper, we propose to use unlabeled data extensively to develop an ASR system for low-resource children's speech. The state-of-the-art wav2vec 2.0 is the baseline ASR technique used here. The baseline's performance is further enhanced with the intuition of Noisy Student Teacher (NST) learning. The proposed technique is not limited to introducing the use of soft labels (i.e., word-level transcriptions) of unlabeled data; it also adapts the learning of the teacher model or the preceding student model, which significantly reduces redundant training. To that effect, a detailed analysis is reported in this paper, as there is a difference between teacher and student learning. In the ASR experiments, character-level tokenization was used, and hence the Connectionist Temporal Classification (CTC) loss was used for fine-tuning. Due to computational limitations, experiments were performed with approximately 12 hours of training data, while 5 hours of development and test data were used from the standard My Science Tutor (My-ST) corpus. The baseline wav2vec 2.0 achieves 34% WER, while the proposed approach improves on it by 10% relative. Further, the analysis of performance loss and the effect of the language model is discussed in detail.
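The abstract notes that fine-tuning uses character-level tokens with the CTC loss. For context, recovering a character sequence from frame-level CTC predictions is commonly done with greedy decoding: collapse consecutive repeated predictions, then remove the blank token. A minimal sketch (the function name and token IDs are illustrative, not taken from the paper):

```python
def ctc_greedy_decode(frame_ids, blank=0):
    """Greedy CTC decoding: collapse consecutive repeats, then drop blanks.

    frame_ids: per-frame argmax token IDs from the acoustic model.
    blank: ID of the CTC blank token.
    """
    out = []
    prev = None
    for t in frame_ids:
        # Keep a token only when it differs from the previous frame
        # (repeat collapse) and is not the blank symbol.
        if t != prev and t != blank:
            out.append(t)
        prev = t
    return out

# Example: frames [3, 3, 0, 1, 1, 2] decode to the token sequence [3, 1, 2];
# the repeated 3s and 1s collapse, and the blank (0) separates distinct tokens.
print(ctc_greedy_decode([3, 3, 0, 1, 1, 2]))  # → [3, 1, 2]
```

Note that the blank between two identical tokens is what lets CTC emit a genuine repeat: `[1, 0, 1]` decodes to `[1, 1]`, whereas `[1, 1]` alone collapses to `[1]`.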
Pages: 5
Related papers
50 records in total
  • [1] Teacher/Student Deep Semi-Supervised Learning for Training with Noisy Labels
    Hailat, Zeyad
    Chen, Xue-Wen
    2018 17TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2018, : 907 - 912
  • [2] Distantly Supervised Biomedical Relation Extraction via Negative Learning and Noisy Student Self-Training
    Dai, Yuanfei
    Zhang, Bin
    Wang, Shiping
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2024, 21 (06) : 1697 - 1708
  • [3] ASBERT: ASR-SPECIFIC SELF-SUPERVISED LEARNING WITH SELF-TRAINING
    Kim, Hyung Yong
    Kim, Byeong-Yeol
    Yoo, Seung Woo
    Lim, Youshin
    Lim, Yunkyu
    Lee, Hanbin
    2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 9 - 14
  • [4] Dynamic Self-Supervised Teacher-Student Network Learning
    Ye, Fei
    Bors, Adrian G.
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (05) : 5731 - 5748
  • [5] Biased Self-supervised learning for ASR
    Kreyssig, Florian L.
    Shi, Yangyang
    Guo, Jinxi
    Sari, Leda
    Mohamed, Abdelrahman
    Woodland, Philip C.
    INTERSPEECH 2023, 2023, : 4948 - 4952
  • [6] LEARNING BETWEEN DIFFERENT TEACHER AND STUDENT MODELS IN ASR
    Wong, Jeremy H. M.
    Gales, Mark J. F.
    Wang, Yu
    2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 93 - 99
  • [7] Learning When to Trust Which Teacher for Weakly Supervised ASR
    Agrawal, Aakriti
    Rao, Milind
    Sahu, Anit Kumar
    Chennupati, Gopinath
    Stolcke, Andreas
    INTERSPEECH 2023, 2023, : 381 - 385
  • [8] On the Learning Dynamics of Semi-Supervised Training for ASR
    Wallington, Electra
    Kershenbaum, Benji
    Klejch, Ondrej
    Bell, Peter
    INTERSPEECH 2021, 2021, : 716 - 720
  • [9] Semi-supervised Entity Alignment via Noisy Student-Based Self Training
    Liu, Yihe
    Dai, Yuanfei
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT II, KSEM 2023, 2023, 14118 : 343 - 354
  • [10] COMBINING SELF-SUPERVISED AND SUPERVISED LEARNING WITH NOISY LABELS
    Zhang, Yongqi
    Zhang, Hui
    Yao, Quanming
    Wan, Jun
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 605 - 609