Semi-supervised Learning with Semantic Knowledge Extraction for Improved Speech Recognition in Air Traffic Control

Cited by: 25
Authors
Srinivasamurthy, Ajay [1]
Motlicek, Petr [1]
Himawan, Ivan [1]
Szaszak, Gyoergy [2]
Oualil, Youssef [2]
Helmke, Hartmut [3]
Affiliations
[1] Idiap Res Inst, Martigny, Switzerland
[2] Saarland Univ UdS, Spoken Language Syst Grp, Saarbrucken, Germany
[3] German Aerosp Ctr DLR, Inst Flight Guidance, Braunschweig, Germany
Source
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION | 2017
Funding
European Union Horizon 2020;
Keywords
Speech Recognition; Air Traffic Control; Semi-supervised learning;
DOI
10.21437/Interspeech.2017-1446
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Automatic Speech Recognition (ASR) can introduce higher levels of automation into Air Traffic Control (ATC), where spoken language is still the predominant form of communication. While ATC uses standard phraseology and a limited vocabulary, we need to adapt the speech recognition systems to local acoustic conditions and vocabularies at each airport to reach optimal performance. Due to the continuous operation of ATC systems, a large and increasing amount of untranscribed speech data is available, allowing semi-supervised learning methods to be used to build and adapt ASR models. In this paper, we first identify the challenges in building ASR systems for specific ATC areas and propose to utilize out-of-domain data to build baseline ASR models. Then we explore different methods of data selection for adapting baseline models by exploiting the continuously increasing untranscribed data. We develop a basic approach capable of exploiting semantic representations of ATC commands. We achieve relative improvements in both word error rate (23.5%) and concept error rate (7%) when adapting ASR models to different ATC conditions in a semi-supervised manner.
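For readers unfamiliar with the metrics quoted in the abstract, the following is a minimal sketch of how a word error rate and a relative improvement are commonly computed. It is not the authors' evaluation code: the function names `word_error_rate` and `relative_improvement`, the example utterance, and the absolute error rates in the usage lines are all hypothetical and chosen only for illustration. The record gives only the relative figures (23.5% and 7%), not the underlying absolute rates.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline: float, adapted: float) -> float:
    """Fraction by which the adapted error rate undercuts the baseline."""
    if baseline <= 0:
        raise ValueError("baseline error rate must be positive")
    return (baseline - adapted) / baseline


# Hypothetical ATC-style utterance: one substitution and one deletion.
wer = word_error_rate(
    "lufthansa one two three descend flight level eight zero",
    "lufthansa one two tree descend level eight zero",
)

# Hypothetical absolute rates; only their ratio mirrors the abstract.
gain = relative_improvement(baseline=0.20, adapted=0.153)
```

A concept error rate can be computed the same way by replacing the word sequences with sequences of extracted command concepts, assuming such a concept extractor is available.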
Pages: 2406-2410
Page count: 5
Related papers (10 of 50 shown)
  • [1] Semantic Relation Extraction Based on Semi-supervised Learning
    Li, Haibo
    Matsuo, Yutaka
    Ishizuka, Mitsuru
    INFORMATION RETRIEVAL TECHNOLOGY, 2010, 6458 : 270 - 279
  • [2] USING COLLECTIVE INFORMATION IN SEMI-SUPERVISED LEARNING FOR SPEECH RECOGNITION
    Varadarajan, Balakrishnan
    Yu, Dong
    Deng, Li
    Acero, Alex
    2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 4633 - +
  • [3] Regularized Urdu Speech Recognition with Semi-Supervised Deep Learning
    Humayun, Mohammad Ali
    Hameed, Ibrahim A.
    Shah, Syed Muslim
    Khan, Sohaib Hassan
    Zafar, Irfan
    Bin Ahmed, Saad
    Shuja, Junaid
    APPLIED SCIENCES-BASEL, 2019, 9 (09):
  • [4] A SEMI-SUPERVISED LEARNING METHOD FOR AIR TRAFFIC COMPLEXITY EVALUATION
    Zhu, Xi
    Cai, Kaiquan
    Cao, Xianbin
    2017 INTEGRATED COMMUNICATIONS, NAVIGATION AND SURVEILLANCE CONFERENCE (ICNS), 2017,
  • [5] Combining cross-modal knowledge transfer and semi-supervised learning for speech emotion recognition
    Zhang, Sheng
    Chen, Min
    Chen, Jincai
    Li, Yuan-Fang
    Wu, Yiling
    Li, Minglei
    Zhu, Chuanbo
    KNOWLEDGE-BASED SYSTEMS, 2021, 229
  • [6] Semi-Supervised Learning of Speech Sounds
    Jansen, Aren
    Niyogi, Partha
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 2264 - 2267
  • [7] Active Learning for Improved Semi-Supervised Semantic Segmentation in Satellite Images
    Desai, Shasvat
    Ghose, Debasmita
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 1485 - 1495
  • [8] Confidence Measures in Speech Emotion Recognition Based on Semi-supervised Learning
    Deng, Jun
    Schuller, Bjoern
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 2223 - 2226
  • [9] Speech Emotion Recognition Using Semi-supervised Learning with Ladder Networks
    Huang, Jian
    Li, Ya
    Tao, Jianhua
    Lian, Zheng
    Niu, Mingyue
    Yi, Jiangyan
    2018 FIRST ASIAN CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII ASIA), 2018,
  • [10] INCREMENTAL SEMI-SUPERVISED LEARNING FOR MULTI-GENRE SPEECH RECOGNITION
    Khonglah, Banriskhem
    Madikeri, Srikanth
    Dey, Subhadeep
    Bourlard, Herve
    Motlicek, Petr
    Billa, Jayadev
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7419 - 7423