Learning When to Trust Which Teacher for Weakly Supervised ASR

Cited by: 0
Authors
Agrawal, Aakriti [1, 2]
Rao, Milind [2 ]
Sahu, Anit Kumar [2 ]
Chennupati, Gopinath [2 ]
Stolcke, Andreas [2 ]
Affiliations
[1] Univ Maryland, College Pk, MD 20742 USA
[2] Amazon Alexa AI, Bellevue, WA USA
Source
INTERSPEECH 2023 | 2023
Keywords
ASR; teacher-student training; semi-supervised learning; self-supervised learning; ROVER
DOI
10.21437/Interspeech.2023-2205
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Automatic speech recognition (ASR) training can utilize multiple experts as teacher models, each trained on a specific domain or accent. Teacher models may be opaque: their architecture may not be known, or their training cadence may differ from that of the student ASR model. Still, the student models are updated incrementally using the pseudo-labels generated independently by the expert teachers. In this paper, we exploit supervision from multiple domain experts in training student ASR models. This training strategy is especially useful in scenarios where few or no human transcriptions are available. To that end, we propose a Smart-Weighter mechanism that selects an appropriate expert based on the input audio, and then trains the student model in an unsupervised setting. We show the efficacy of our approach using LibriSpeech and LibriLight benchmarks and find an improvement of 4 to 25% over baselines that uniformly weight all the experts, use a single expert model, or combine experts using ROVER.
Pages: 381-385
Number of pages: 5