Lithuanian Broadcast Speech Transcription using Semi-supervised Acoustic Model Training

Cited by: 13
Authors
Lileikyte, Rasa [1]
Gorin, Arseniy [1]
Lamel, Lori [1]
Gauvain, Jean-Luc [1]
Fraga-Silva, Thiago [2]
Affiliations
[1] Univ Paris Saclay, CNRS, LIMSI, 508 Campus Univ, F-91405 Orsay, France
[2] Vocapia Res, 28 Rue Jean Rostand, F-91400 Orsay, France
Source
SLTU-2016 5TH WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGIES FOR UNDER-RESOURCED LANGUAGES | 2016, Vol. 81
Keywords
Automatic speech recognition; Low-resourced languages; Semi-supervised training; Neural networks; Lithuanian language; RECOGNITION;
DOI
10.1016/j.procs.2016.04.037
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
This paper reports on experimental work to build a speech transcription system for Lithuanian broadcast data, relying on unsupervised and semi-supervised training methods as well as on other low-knowledge methods to compensate for missing resources. Unsupervised acoustic model training is investigated using 360 hours of untranscribed speech data. A graphemic pronunciation approach is used to simplify pronunciation model generation and thereby ease language model adaptation for system users. Discriminative training on top of semi-supervised training is also investigated, as well as various types of acoustic features and their combinations. Experimental results are provided for each development step, along with contrastive results comparing various options. Using the best system configuration, a word error rate of 18.3% is obtained on a set of development data from the Quaero program. © 2016 The Authors. Published by Elsevier B.V.
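The abstract reports its headline result as a word error rate (WER). As a rough illustration of how that metric is conventionally defined (word-level edit distance divided by the number of reference words), a minimal sketch follows. The function name `word_error_rate` and the example sentences are hypothetical and are not taken from the paper. The sketch also assumes plain whitespace tokenization, which may differ from the text normalization used in the authors' evaluation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace; no text normalization
    (case folding, punctuation removal) is applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substitution and one deletion against 4 reference words.
print(word_error_rate("a b c d", "a x c"))
```

Under this definition, a WER of 18.3% corresponds to roughly 18 word errors (substitutions, deletions, and insertions combined) per 100 reference words.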
Pages: 107-113
Number of pages: 7