Learning When to Trust Which Teacher for Weakly Supervised ASR

Cited by: 0

Authors:

Agrawal, Aakriti ^{[1,2]}

Rao, Milind ^{[2]}

Sahu, Anit Kumar ^{[2]}

Chennupati, Gopinath ^{[2]}

Stolcke, Andreas ^{[2]}

Affiliations:

[1] Univ Maryland, College Pk, MD 20742 USA

[2] Amazon Alexa AI, Bellevue, WA USA

Source:

INTERSPEECH 2023 | 2023

Keywords:

ASR; teacher-student training; semi-supervised learning; self-supervised learning; ROVER

DOI:

10.21437/Interspeech.2023-2205

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Automatic speech recognition (ASR) training can utilize multiple experts as teacher models, each trained on a specific domain or accent. Teacher models may be opaque, since their architecture may not be known or their training cadence may differ from that of the student ASR model. Still, the student models are updated incrementally using the pseudo-labels generated independently by the expert teachers. In this paper, we exploit supervision from multiple domain experts in training student ASR models. This training strategy is especially useful in scenarios where few or no human transcriptions are available. To that end, we propose a Smart-Weighter mechanism that selects an appropriate expert based on the input audio, and then trains the student model in an unsupervised setting. We show the efficacy of our approach using LibriSpeech and LibriLight benchmarks and find an improvement of 4 to 25% over baselines that uniformly weight all the experts, use a single expert model, or combine experts using ROVER.
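The abstract contrasts input-dependent expert selection with uniform weighting of all teachers. The Python sketch below illustrates that contrast for a single utterance and is not the paper's implementation: the softmax gate, the function names, and all numeric values are assumptions made for illustration, since the abstract does not say how the Smart-Weighter computes its scores.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def uniform_weights(num_teachers):
    """Baseline: every teacher gets the same weight."""
    return [1.0 / num_teachers] * num_teachers


def select_expert(scores):
    """Hard selection: index of the highest-scoring teacher."""
    return max(range(len(scores)), key=lambda i: scores[i])


def weighted_student_loss(per_teacher_losses, weights):
    """Combine the student's loss against each teacher's pseudo-label."""
    return sum(w * l for w, l in zip(weights, per_teacher_losses))


# Hypothetical values for one utterance and three teachers.
scores = [2.0, 0.5, -1.0]  # assumed per-teacher scores from a gating model
losses = [0.8, 1.6, 2.4]   # assumed student loss against each pseudo-label

soft = softmax(scores)
chosen = select_expert(scores)

loss_uniform = weighted_student_loss(losses, uniform_weights(len(losses)))
loss_soft = weighted_student_loss(losses, soft)
loss_hard = losses[chosen]
```

In this toy setup the gate favors the first teacher, so the soft-weighted and hard-selected losses are dominated by that teacher's pseudo-label, while the uniform baseline averages all three equally.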


Pages: 381-385

Page count: 5
