COMBINING UNSUPERVISED AND TEXT AUGMENTED SEMI-SUPERVISED LEARNING FOR LOW RESOURCED AUTOREGRESSIVE SPEECH RECOGNITION

Citations: 1

Authors:

Li, Chak-Fai [1]

Keith, Francis [1]

Hartmann, William [1]

Snover, Matthew [1]

Affiliations:

[1] Raytheon BBN Technol, Cambridge, MA 02138 USA

Source:

2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2022

Keywords:

seq2seq; unsupervised learning; semi-supervised training; domain adaptation; representation

DOI:

10.1109/ICASSP43922.2022.9747005

Chinese Library Classification number:

O42 [Acoustics]

Discipline classification code:

070206; 082403

Abstract:

Recent advances in unsupervised representation learning have demonstrated the impact of pretraining on large amounts of read speech. We adapt these techniques for domain adaptation in low-resource (both in terms of data and compute) conversational and broadcast domains. Moving beyond CTC, we pretrain state-of-the-art Conformer models in an unsupervised manner. While the unsupervised approach outperforms traditional semi-supervised training, the techniques are complementary. Combining the techniques yields a 5% absolute improvement in WER, averaged over all conditions, compared to semi-supervised training alone. Additional text data is incorporated through external language models. By using CTC-based decoding, we are better able to take advantage of the additional text data. When used as a transcription model, CTC-based decoding allows the Conformer model to incorporate the knowledge from the language model through semi-supervised training better than shallow fusion does. Final performance is an additional 2% absolute better when using CTC-based decoding for semi-supervised training compared to shallow fusion.

Pages: 6892-6896

Page count: 5
