SEMI-SUPERVISED TRAINING IN LOW-RESOURCE ASR AND KWS

Cited: 0
Authors
Metze, Florian [1 ,2 ]
Gandhe, Ankur [1 ,2 ]
Miao, Yajie [1 ,2 ]
Sheikh, Zaid [1 ,2 ]
Wang, Yun [1 ,2 ]
Xu, Di [1 ,2 ]
Zhang, Hao [1 ,2 ]
Kim, Jungsuk [3 ,4 ]
Lane, Ian [3 ,4 ]
Lee, Won Kyum [3 ,4 ]
Stueker, Sebastian [5 ]
Mueller, Markus [5 ]
Affiliations
[1] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
[2] Carnegie Mellon Univ, Language Technol Inst, Moffett Field, CA USA
[3] Carnegie Mellon Univ, Dept Elect & Comp Engn, Pittsburgh, PA 15213 USA
[4] Carnegie Mellon Univ, Dept Elect & Comp Engn, Moffett Field, CA USA
[5] Karlsruhe Inst Technol, Karlsruhe, Germany
Source
2015 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 2015
Funding
National Science Foundation (USA);
Keywords
spoken term detection; automatic speech recognition; low-resource LTs; semi-supervised training; RECOGNITION;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
For "low-resource" Keyword Search (KWS) and Speech-to-Text (STT) tasks in particular, more untranscribed test data may be available than transcribed training data. Several approaches have been proposed to make this data useful during system development, even when the initial systems have Word Error Rates (WER) above 70%. In this paper, we present a set of experiments on low-resource, telephone-quality speech in Assamese, Bengali, Lao, Haitian, Zulu, and Tamil, demonstrating the impact such techniques can have, in particular learning robust bottleneck features on the test data. In the case of Tamil, where significantly more test data than training data is available, we integrated semi-supervised training and speaker adaptation on the test data and achieved significant additional improvements in both STT and KWS.
Pages: 4699 - 4703
Number of pages: 5