Semi-supervised Learning with Semantic Knowledge Extraction for Improved Speech Recognition in Air Traffic Control

Cited by: 26
Authors
Srinivasamurthy, Ajay [1 ]
Motlicek, Petr [1 ]
Himawan, Ivan [1 ]
Szaszak, Gyoergy [2 ]
Oualil, Youssef [2 ]
Helmke, Hartmut [3 ]
Affiliations
[1] Idiap Res Inst, Martigny, Switzerland
[2] Saarland Univ UdS, Spoken Language Syst Grp, Saarbrucken, Germany
[3] German Aerosp Ctr DLR, Inst Flight Guidance, Braunschweig, Germany
Source
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION | 2017
Funding
EU Horizon 2020;
Keywords
Speech Recognition; Air Traffic Control; Semi supervised learning;
DOI
10.21437/Interspeech.2017-1446
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Automatic Speech Recognition (ASR) can introduce higher levels of automation into Air Traffic Control (ATC), where spoken language is still the predominant form of communication. While ATC uses standard phraseology and a limited vocabulary, speech recognition systems must be adapted to the local acoustic conditions and vocabulary of each airport to reach optimal performance. Because ATC systems operate continuously, a large and increasing amount of untranscribed speech data is available, which allows semi-supervised learning methods to build and adapt ASR models. In this paper, we first identify the challenges in building ASR systems for specific ATC areas and propose to utilize out-of-domain data to build baseline ASR models. Then we explore different methods of data selection for adapting baseline models by exploiting the continuously increasing untranscribed data. We develop a basic approach capable of exploiting semantic representations of ATC commands. We achieve relative improvements in both word error rate (23.5%) and concept error rate (7%) when adapting ASR models to different ATC conditions in a semi-supervised manner.
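The abstract reports its results as relative improvements in word error rate. As a reader aid, the sketch below shows one common way such a figure can be computed: word error rate as word-level edit distance divided by reference length, and relative improvement as the reduction divided by the baseline value. The function names, the sample utterance, and the baseline and adapted rates are illustrative assumptions, not taken from the paper; only the 23.5% figure appears in the abstract.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_improvement(baseline, adapted):
    """Fraction by which the adapted error rate undercuts the baseline."""
    return (baseline - adapted) / baseline


# Hypothetical ATC-style utterance (not from the paper's data).
reference = "lufthansa one two three descend flight level eight zero"
hypothesis = "lufthansa one two tree descend level eight zero"
wer = word_error_rate(reference, hypothesis)

# Hypothetical error rates chosen so the ratio matches the reported 23.5%.
gain = relative_improvement(0.20, 0.153)
```

A concept error rate could be computed the same way by replacing the word sequences with sequences of extracted command concepts, although the paper's exact definition is not given in this record.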
Pages: 2406-2410
Number of pages: 5