Semi-supervised Learning with Semantic Knowledge Extraction for Improved Speech Recognition in Air Traffic Control

Cited by: 26
Authors
Srinivasamurthy, Ajay [1 ]
Motlicek, Petr [1 ]
Himawan, Ivan [1 ]
Szaszak, Gyoergy [2 ]
Oualil, Youssef [2 ]
Helmke, Hartmut [3 ]
Affiliations
[1] Idiap Res Inst, Martigny, Switzerland
[2] Saarland Univ UdS, Spoken Language Syst Grp, Saarbrucken, Germany
[3] German Aerosp Ctr DLR, Inst Flight Guidance, Braunschweig, Germany
Source
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION | 2017
Funding
European Union Horizon 2020;
Keywords
Speech Recognition; Air Traffic Control; Semi-supervised learning;
DOI
10.21437/Interspeech.2017-1446
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Automatic Speech Recognition (ASR) can introduce higher levels of automation into Air Traffic Control (ATC), where spoken language is still the predominant form of communication. While ATC uses standard phraseology and a limited vocabulary, speech recognition systems must be adapted to the local acoustic conditions and vocabulary of each airport to reach optimal performance. Because ATC systems operate continuously, a large and growing amount of untranscribed speech data is available, which allows semi-supervised learning methods to be used to build and adapt ASR models. In this paper, we first identify the challenges in building ASR systems for specific ATC areas and propose to utilize out-of-domain data to build baseline ASR models. We then explore different data selection methods for adapting the baseline models by exploiting the continuously increasing untranscribed data. We develop a basic approach capable of exploiting semantic representations of ATC commands. We achieve relative improvements in both word error rate (23.5%) and concept error rate (7%) when adapting ASR models to different ATC conditions in a semi-supervised manner.
Pages: 2406-2410
Number of pages: 5