Semi-Supervised Training of DNN-Based Acoustic Model for ATC Speech Recognition

Cited by: 12
Authors
Smidl, Lubos [1]
Svec, Jan [2]
Prazak, Ales [1]
Trmal, Jan [3]
Affiliations
[1] Univ West Bohemia, Dept Cybernet, Plzen, Czech Republic
[2] SpeechTech Sro, Plzen, Czech Republic
[3] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD USA
Source
SPEECH AND COMPUTER (SPECOM 2018) | 2018 / Vol. 11096
Keywords
Semi-supervised training; Data selection; Acoustic modelling; ATC speech recognition
DOI
10.1007/978-3-319-99579-3_66
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we describe a semi-supervised training method used to improve the generalization of an Air Traffic Control (ATC) speech recognizer. The paper introduces the problems and challenges of ATC English recognition and describes the available datasets and ongoing research projects. A baseline recognition model is then used to transcribe unlabelled data from a publicly available source: the LiveATC community portal, which records and archives ATC communication near airports. The automatically transcribed data are filtered by a data selection procedure based on confidence scores, and the acoustic model is retrained to obtain a more general model. Results on Czech- and French-accented data are reported.
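The confidence-based data selection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual selection criterion, so everything here is an assumption made for illustration: the `Hypothesis` container, scoring an utterance by the mean of its per-word confidences, and the 0.9 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    """One automatically transcribed utterance (hypothetical container)."""
    utt_id: str
    words: List[str]
    word_confidences: List[float]  # assumed: one score in [0, 1] per word


def utterance_confidence(hyp: Hypothesis) -> float:
    """Mean word confidence; an utterance with no words scores 0.0."""
    if not hyp.word_confidences:
        return 0.0
    return sum(hyp.word_confidences) / len(hyp.word_confidences)


def select_for_retraining(hyps: List[Hypothesis],
                          threshold: float = 0.9) -> List[Hypothesis]:
    """Keep utterances whose confidence reaches the (assumed) threshold."""
    return [h for h in hyps if utterance_confidence(h) >= threshold]


if __name__ == "__main__":
    hyps = [
        Hypothesis("utt1", ["cleared", "to", "land"], [0.98, 0.95, 0.97]),
        Hypothesis("utt2", ["contact", "tower"], [0.55, 0.60]),
    ]
    for h in select_for_retraining(hyps):
        print(h.utt_id, " ".join(h.words))
```

In a setup like this, the retained utterances and their automatic transcripts would be added to the training set before the acoustic model is retrained. A stricter threshold keeps fewer but cleaner transcripts, while a looser one adds more data with more transcription errors.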
Pages: 646-655
Number of pages: 10
Related papers
50 in total
  • [1] SEMI-SUPERVISED ENSEMBLE DNN ACOUSTIC MODEL TRAINING
    Li, Sheng
    Lu, Xugang
    Sakai, Shinsuke
    Mimura, Masato
    Kawahara, Tatsuya
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5270 - 5274
  • [2] Semi-Supervised Speech Recognition Acoustic Model Training Using Policy Gradient
    Chung, Hoon
    Lee, Sung Joo
    Jeon, Hyeong Bae
    Park, Jeon Gue
    APPLIED SCIENCES-BASEL, 2020, 10 (10):
  • [3] Semi-supervised acoustic model training for speech with code-switching
    Yilmaz, Emre
    McLaren, Mitchell
    van den Heuvel, Henk
    van Leeuwen, David A.
    SPEECH COMMUNICATION, 2018, 105 : 12 - 22
  • [4] Lithuanian Broadcast Speech Transcription using Semi-supervised Acoustic Model Training
    Lileikyte, Rasa
    Gorin, Arseniy
    Lamel, Lori
    Gauvain, Jean-Luc
    Fraga-Silva, Thiago
    SLTU-2016 5TH WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGIES FOR UNDER-RESOURCED LANGUAGES, 2016, 81 : 107 - 113
  • [5] Semi-supervised and unsupervised discriminative language model training for automatic speech recognition
    Dikici, Erinc
    Saraclar, Murat
    SPEECH COMMUNICATION, 2016, 83 : 54 - 63
  • [6] Exploiting Eigenposteriors for Semi-supervised Training of DNN Acoustic Models with Sequence Discrimination
    Dighe, Pranay
    Asaei, Afsaneh
    Bourlard, Herve
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3552 - 3556
  • [7] Semi-supervised DNN training with word selection for ASR
    Vesely, Karel
    Burget, Lukas
    Cernocky, Jan Honza
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3687 - 3691
  • [8] LANGUAGE DIARIZATION FOR SEMI-SUPERVISED BILINGUAL ACOUSTIC MODEL TRAINING
    Yilmaz, Emre
    McLaren, Mitchell
    van den Heuvel, Henk
    van Leeuwen, David A.
    2017 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2017, : 91 - 96
  • [9] SEMI-SUPERVISED AND POPULATION BASED TRAINING FOR VOICE COMMANDS RECOGNITION
    Elibol, Oguz H.
    Keskin, Gokce
    Thomas, Anil
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6371 - 6375
  • [10] Investigation of Semi-supervised Acoustic Model Training based on the Committee of Heterogeneous Neural Networks
    Kanda, Naoyuki
    Harada, Shoji
    Lu, Xugang
    Kawai, Hisashi
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 1325 - 1329