Air traffic control speech recognition system cross-task & speaker adaptation

Cited by: 11
Authors
de Cordoba, R. [1 ]
Ferreiros, J. [1 ]
San-Segundo, R. [1 ]
Macias-Guarasa, J. [1 ]
Montero, J. M. [1 ]
Fernandez, F. [1 ]
D'Haro, L. F. [1 ]
Pardo, J. M. [1 ]
Affiliation
[1] Univ Politecn Madrid, Speech Technol Grp, Dept Elect Engn, ETSI Telecomunicat, E-28040 Madrid, Spain
DOI
10.1109/MAES.2006.1705165
Chinese Library Classification number
V [Aviation, Aerospace]
Discipline classification code
08; 0825
Abstract
We present an overview of the most common techniques used in automatic speech recognition to adapt a general system to a different environment (known as cross-task adaptation), such as an air traffic control (ATC) system. The conditions in ATC are very specific: highly spontaneous speech, background noise, and a fast speaking rate. As a result, a typical speech recognizer gives unsatisfactory recognition results. We have to decide on the best option for the modeling: to develop acoustic models specific to those conditions from scratch using the data available for the new environment, or to carry out cross-task adaptation starting from reliable HMM models (which usually requires less data in the target domain). We begin with a description of the main techniques considered for cross-task adaptation, namely Maximum A Posteriori (MAP), Maximum Likelihood Linear Regression (MLLR), and the two together. We have applied each in two speech recognizers for air traffic control tasks, one for spontaneous speech and the other for a command interface. We show the performance of these techniques and compare them with the development of a new system from scratch. We also show the results obtained for speaker adaptation using a variable amount of adaptation data. The main conclusion is that MLLR can outperform MAP when a large number of transforms is used, and that MLLR followed by MAP is the best option. All of these techniques are better than developing a new system from scratch, showing the effectiveness of mean and variance adaptation.
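As an illustration of the two adaptation techniques named in the abstract, the following is a minimal sketch of mean adaptation for a single Gaussian. It is not the authors' implementation. It assumes the standard textbook forms: the MAP mean update `(tau * prior_mean + sum_t gamma_t * x_t) / (tau + sum_t gamma_t)`, and an MLLR mean transform `A * mean + b` whose parameters have already been estimated elsewhere. The function names, the prior weight `tau`, and the toy values are hypothetical.

```python
def map_adapt_mean(prior_mean, frames, posteriors, tau=10.0):
    """MAP update of one Gaussian mean (illustrative sketch).

    prior_mean: mean from the general (source-task) model.
    frames:     adaptation feature vectors x_t.
    posteriors: occupation probabilities gamma_t of this Gaussian per frame.
    tau:        prior weight (hypothetical value); larger keeps the prior longer.
    """
    dim = len(prior_mean)
    occupancy = sum(posteriors)
    weighted_sum = [
        sum(g * x[d] for g, x in zip(posteriors, frames)) for d in range(dim)
    ]
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]


def apply_mllr_mean(mean, A, b):
    """Apply an already-estimated MLLR transform: new_mean = A * mean + b."""
    return [
        sum(A[i][j] * mean[j] for j in range(len(mean))) + b[i]
        for i in range(len(b))
    ]


# "MLLR followed by MAP": transform the mean first, then use it as the MAP prior.
A = [[1.0, 0.0], [0.0, 1.0]]   # toy transform (identity plus bias)
b = [0.5, -0.5]
mllr_mean = apply_mllr_mean([1.0, 2.0], A, b)
adapted = map_adapt_mean(mllr_mean, [[2.0, 2.0]], [1.0], tau=1.0)
```

With no adaptation frames the MAP update returns the prior mean unchanged, and with many frames it approaches the data mean. MLLR, by contrast, shares one transform across many Gaussians, so it can also move Gaussians that received no adaptation data.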
Pages: 12-17
Number of pages: 6