Lithuanian Broadcast Speech Transcription using Semi-supervised Acoustic Model Training

Cited by: 13
Authors
Lileikyte, Rasa [1]
Gorin, Arseniy [1]
Lamel, Lori [1]
Gauvain, Jean-Luc [1]
Fraga-Silva, Thiago [2]
Affiliations
[1] Univ Paris Saclay, CNRS, LIMSI, 508 Campus Univ, F-91405 Orsay, France
[2] Vocapia Res, 28 Rue Jean Rostand, F-91400 Orsay, France
Source
SLTU-2016 5TH WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGIES FOR UNDER-RESOURCED LANGUAGES | 2016, Vol. 81
Keywords
Automatic speech recognition; Low-resourced languages; Semi-supervised training; Neural networks; Lithuanian language; RECOGNITION;
DOI
10.1016/j.procs.2016.04.037
Chinese Library Classification (CLC)
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
This paper reports on experimental work to build a speech transcription system for Lithuanian broadcast data, relying on unsupervised and semi-supervised training methods as well as other low-knowledge methods to compensate for missing resources. Unsupervised acoustic model training is investigated using 360 hours of untranscribed speech data. A graphemic pronunciation approach is used to simplify pronunciation model generation and thereby ease language model adaptation for system users. Discriminative training on top of semi-supervised training is also investigated, as well as various types of acoustic features and their combinations. Experimental results are provided for each development step, along with contrastive results comparing various options. Using the best system configuration, a word error rate of 18.3% is obtained on a set of development data from the Quaero program. (C) 2016 The Authors. Published by Elsevier B.V.
Pages: 107-113
Page count: 7
Related papers
50 in total
  • [21] Semi-Supervised Acoustic Model Training by Discriminative Data Selection From Multiple ASR Systems' Hypotheses
    Li, Sheng
    Akita, Yuya
    Kawahara, Tatsuya
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (09) : 1524 - 1534
  • [22] Semi-supervised Part-of-speech Tagging in Speech Applications
    Dufour, Richard
    Favre, Benoit
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 1373 - 1376
  • [23] A Semi-Supervised Complementary Joint Training Approach for Low-Resource Speech Recognition
    Du, Ye-Qian
    Zhang, Jie
    Fang, Xin
    Wu, Ming-Hui
    Yang, Zhou-Wang
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 3908 - 3921
  • [24] Comparing Self-Supervised Pre-Training and Semi-Supervised Training for Speech Recognition in Languages with Weak Language Models
    Lam-Yee-Mui, Lea-Marie
    Yang, Lucas Ondel
    Klejch, Ondrej
    INTERSPEECH 2023, 2023, : 87 - 91
  • [25] DEEP NEURAL NETWORK FEATURES AND SEMI-SUPERVISED TRAINING FOR LOW RESOURCE SPEECH RECOGNITION
    Thomas, Samuel
    Seltzer, Michael L.
    Church, Kenneth
    Hermansky, Hynek
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 6704 - 6708
  • [26] Semi-supervised Maximum Mutual Information Training of Deep Neural Network Acoustic Models
    Manohar, Vimal
    Povey, Daniel
    Khudanpur, Sanjeev
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2630 - 2634
  • [27] On the Learning Dynamics of Semi-Supervised Training for ASR
    Wallington, Electra
    Kershenbaum, Benji
    Klejch, Ondrej
    Bell, Peter
    INTERSPEECH 2021, 2021, : 716 - 720
  • [28] SEMI-SUPERVISED TRAINING OF DEEP NEURAL NETWORKS
    Vesely, Karel
    Hannemann, Mirko
    Burget, Lukas
    2013 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING (ASRU), 2013, : 267 - 272
  • [29] Unbiased semi-supervised LF-MMI training using dropout
    Tong, Sibo
    Vyas, Apoorv
    Garner, Philip N.
    Bourlard, Herve
    INTERSPEECH 2019, 2019, : 1576 - 1580
  • [30] Semi-Supervised Learning for Spanish Speech Recognition Using Deep Neural Networks
    Rosario Campomanes-Alvarez, Blanca
    Quiros, Pelayo
    Fernandez, Bernardo
    APPLICATIONS OF INTELLIGENT SYSTEMS, 2018, 310 : 19 - 29