Learning When to Trust Which Teacher for Weakly Supervised ASR

Cited by: 0
Authors
Agrawal, Aakriti [1, 2]
Rao, Milind [2 ]
Sahu, Anit Kumar [2 ]
Chennupati, Gopinath [2 ]
Stolcke, Andreas [2 ]
Affiliations
[1] Univ Maryland, College Pk, MD 20742 USA
[2] Amazon Alexa AI, Bellevue, WA USA
Source
INTERSPEECH 2023 | 2023
Keywords
ASR; teacher-student training; semi-supervised learning; self-supervised learning; ROVER;
DOI
10.21437/Interspeech.2023-2205
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
Automatic speech recognition (ASR) training can utilize multiple experts as teacher models, each trained on a specific domain or accent. Teacher models may be opaque, since their architecture may not be known or their training cadence may differ from that of the student ASR model. Still, the student models are updated incrementally using the pseudo-labels generated independently by the expert teachers. In this paper, we exploit supervision from multiple domain experts in training student ASR models. This training strategy is especially useful in scenarios where few or no human transcriptions are available. To that end, we propose a Smart-Weighter mechanism that selects an appropriate expert based on the input audio, and then trains the student model in an unsupervised setting. We show the efficacy of our approach using the LibriSpeech and LibriLight benchmarks and find an improvement of 4 to 25% over baselines that uniformly weight all the experts, use a single expert model, or combine experts using ROVER.
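The abstract does not spell out how the Smart-Weighter is built, so the following is only a minimal illustrative sketch of the general idea of weighting teachers per utterance. It assumes a linear gate over utterance-level audio features followed by a softmax, and a student loss formed as the weighted sum of per-teacher pseudo-label losses. The names `teacher_weights`, `weighted_student_loss`, `select_teacher`, `gate_matrix`, and all numeric values are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def teacher_weights(audio_features, gate_matrix):
    """Hypothetical gate: one linear score per teacher, normalized by softmax.

    gate_matrix has one row per teacher and one column per feature.
    """
    scores = [sum(w * x for w, x in zip(row, audio_features))
              for row in gate_matrix]
    return softmax(scores)


def weighted_student_loss(per_teacher_losses, weights):
    """Student loss as a weighted sum of losses against each teacher's pseudo-label."""
    return sum(w * l for w, l in zip(weights, per_teacher_losses))


def select_teacher(weights):
    """Hard selection: index of the teacher with the largest weight."""
    return max(range(len(weights)), key=lambda i: weights[i])


# Toy example with two teachers and a 2-dimensional feature vector (made-up values).
features = [1.0, 0.0]
gate_matrix = [[2.0, 0.0],   # teacher 0 scores high on feature 0
               [0.0, 2.0]]   # teacher 1 scores high on feature 1
losses = [1.0, 3.0]          # student loss against each teacher's pseudo-label

weights = teacher_weights(features, gate_matrix)
chosen = select_teacher(weights)
smart_loss = weighted_student_loss(losses, weights)
uniform_loss = weighted_student_loss(losses, [0.5, 0.5])
```

In this toy setup the gate favors teacher 0, so the weighted loss leans toward that teacher's pseudo-label, whereas the uniform baseline treats both teachers equally regardless of the input audio.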
Pages: 381-385
Page count: 5