Lithuanian Broadcast Speech Transcription using Semi-supervised Acoustic Model Training

Cited by: 13

Authors:

Lileikyte, Rasa [1]

Gorin, Arseniy [1]

Lamel, Lori [1]

Gauvain, Jean-Luc [1]

Fraga-Silva, Thiago [2]

Affiliations:

[1] Univ Paris Saclay, CNRS, LIMSI, 508 Campus Univ, F-91405 Orsay, France

[2] Vocapia Res, 28 Rue Jean Rostand, F-91400 Orsay, France

Source:

SLTU-2016 5TH WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGIES FOR UNDER-RESOURCED LANGUAGES | 2016 / Vol. 81

Keywords:

Automatic speech recognition; Low-resourced languages; Semi-supervised training; Neural networks; Lithuanian language; RECOGNITION;

DOI:

10.1016/j.procs.2016.04.037

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

This paper reports on experimental work to build a speech transcription system for Lithuanian broadcast data, relying on unsupervised and semi-supervised training methods, as well as other low-knowledge methods, to compensate for missing resources. Unsupervised acoustic model training is investigated using 360 hours of untranscribed speech data. A graphemic pronunciation approach is used to simplify pronunciation model generation and thereby ease language model adaptation for system users. Discriminative training on top of semi-supervised training is also investigated, as are various types of acoustic features and their combinations. Experimental results are provided for each development step, along with contrastive results comparing various options. Using the best system configuration, a word error rate of 18.3% is obtained on a set of development data from the Quaero program. (C) 2016 The Authors. Published by Elsevier B.V.


Pages: 107-113

Number of pages: 7

