Noisy Student Teacher Training with Self Supervised Learning for Children ASR

Cited by: 1
Authors
Chaturvedi, Shreya S. [1 ]
Sailor, Hardik B. [2 ,3 ]
Patil, Hemant A. [1 ]
Affiliations
[1] DA IICT, Speech Res Lab, Gandhinagar, India
[2] A*STAR, Inst Infocomm Res (I2R), Singapore, Singapore
[3] Samsung R&D Inst, Bangalore, Karnataka, India
Source
2022 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS, SPCOM | 2022
DOI
10.1109/SPCOM55316.2022.9840763
CLC (Chinese Library Classification)
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology];
Subject classification codes
0808 ; 0809 ;
Abstract
Automatic Speech Recognition (ASR) is a fast-growing field in which reliable systems are built for high-resource languages and for adult speech. However, the performance of such ASR systems degrades on children's speech, due to its numerous acoustic variabilities and the scarcity of resources. In this paper, we propose to use unlabeled data extensively to develop an ASR system for low-resource children's speech. The state-of-the-art wav2vec 2.0 is the baseline ASR technique used here. The baseline's performance is further enhanced with the intuition of Noisy Student Teacher (NST) learning. The proposed technique is not limited to introducing the use of soft labels (i.e., word-level transcriptions) of unlabeled data; it also adapts the learning of the teacher model or the preceding student model, which significantly reduces redundant training. To that effect, a detailed analysis is reported in this paper, as there is a difference between teacher and student learning. In the ASR experiments, character-level tokenization was used, and hence the Connectionist Temporal Classification (CTC) loss was used for fine-tuning. Due to computational limitations, experiments were performed with approximately 12 hours of training data, while 5 hours of development and test data were used from the standard My Science Tutor (My-ST) corpus. The baseline wav2vec 2.0 achieves 34% WER, while the proposed approach improves on it by 10% relative. Further, the analysis of performance loss and the effect of the language model is discussed in detail.
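The abstract notes that fine-tuning uses character-level tokens with the CTC loss. For context, recovering a character sequence from frame-level CTC predictions is commonly done with greedy decoding: collapse consecutive repeated predictions, then remove the blank token. A minimal sketch (the function name and token IDs are illustrative, not taken from the paper):

```python
def ctc_greedy_decode(frame_ids, blank=0):
    """Greedy CTC decoding: collapse consecutive repeats, then drop blanks.

    frame_ids: per-frame argmax token IDs from the acoustic model.
    blank: ID of the CTC blank token.
    """
    out = []
    prev = None
    for t in frame_ids:
        # Keep a token only when it differs from the previous frame
        # (repeat collapse) and is not the blank symbol.
        if t != prev and t != blank:
            out.append(t)
        prev = t
    return out

# Example: frames [3, 3, 0, 1, 1, 2] decode to the token sequence [3, 1, 2];
# the repeated 3s and 1s collapse, and the blank (0) separates distinct tokens.
print(ctc_greedy_decode([3, 3, 0, 1, 1, 2]))  # → [3, 1, 2]
```

Note that the blank between two identical tokens is what lets CTC emit a genuine repeat: `[1, 0, 1]` decodes to `[1, 1]`, whereas `[1, 1]` alone collapses to `[1]`.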
Pages: 5
Related papers
50 records in total
  • [1] Teacher/Student Deep Semi-Supervised Learning for Training with Noisy Labels
    Hailat, Zeyad
    Chen, Xue-Wen
    2018 17TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2018, : 907 - 912
  • [2] Distantly Supervised Biomedical Relation Extraction via Negative Learning and Noisy Student Self-Training
    Dai, Yuanfei
    Zhang, Bin
    Wang, Shiping
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2024, 21 (06) : 1697 - 1708
  • [3] ASBERT: ASR-SPECIFIC SELF-SUPERVISED LEARNING WITH SELF-TRAINING
    Kim, Hyung Yong
    Kim, Byeong-Yeol
    Yoo, Seung Woo
    Lim, Youshin
    Lim, Yunkyu
    Lee, Hanbin
    2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 9 - 14
  • [4] Dynamic Self-Supervised Teacher-Student Network Learning
    Ye, Fei
    Bors, Adrian G.
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (05) : 5731 - 5748
  • [5] Biased Self-supervised learning for ASR
    Kreyssig, Florian L.
    Shi, Yangyang
    Guo, Jinxi
    Sari, Leda
    Mohamed, Abdelrahman
    Woodland, Philip C.
    INTERSPEECH 2023, 2023, : 4948 - 4952
  • [6] LEARNING BETWEEN DIFFERENT TEACHER AND STUDENT MODELS IN ASR
    Wong, Jeremy H. M.
    Gales, Mark J. F.
    Wang, Yu
    2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 93 - 99
  • [7] Learning When to Trust Which Teacher for Weakly Supervised ASR
    Agrawal, Aakriti
    Rao, Milind
    Sahu, Anit Kumar
    Chennupati, Gopinath
    Stolcke, Andreas
    INTERSPEECH 2023, 2023, : 381 - 385
  • [8] On the Learning Dynamics of Semi-Supervised Training for ASR
    Wallington, Electra
    Kershenbaum, Benji
    Klejch, Ondrej
    Bell, Peter
    INTERSPEECH 2021, 2021, : 716 - 720
  • [9] Semi-supervised Entity Alignment via Noisy Student-Based Self Training
    Liu, Yihe
    Dai, Yuanfei
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT II, KSEM 2023, 2023, 14118 : 343 - 354
  • [10] COMBINING SELF-SUPERVISED AND SUPERVISED LEARNING WITH NOISY LABELS
    Zhang, Yongqi
    Zhang, Hui
    Yao, Quanming
    Wan, Jun
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 605 - 609