Low-Resource Speech Recognition and Keyword-Spotting

Cited by: 5
Authors
Gales, Mark J. F. [1]
Knill, Kate M. [1]
Ragni, Anton [1]
Affiliations
[1] Univ Cambridge, Dept Engn, Trumpington St, Cambridge, England
Source
SPEECH AND COMPUTER, SPECOM 2017 | 2017 / Vol. 10458
Keywords
Prosody perception; Narrow versus broad focus; Japanese learners of English; L2; acquisition; DEEP NEURAL-NETWORK; DATA AUGMENTATION;
DOI
10.1007/978-3-319-66429-3_1
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
The IARPA Babel program ran from March 2012 to November 2016. The aim of the program was to develop agile and robust speech technology that can be rapidly applied to any human language in order to provide effective search capability on large quantities of real-world data. This paper will describe some of the developments in speech recognition and keyword spotting during the lifetime of the project. Two technical areas will be briefly discussed, with a focus on techniques developed at Cambridge University: the application of deep learning for low-resource speech recognition, and efficient approaches for keyword spotting. Finally, a brief analysis of the Babel speech language characteristics and language performance will be presented.
Pages: 3-19
Number of pages: 17
Related papers
50 in total
  • [1] Combining Tandem and Hybrid Systems for Improved Speech Recognition and Keyword Spotting on Low Resource Languages
    Rath, Shakti P.
    Knill, Kate M.
    Ragni, Anton
    Gales, Mark J. F.
    15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 835 - 839
  • [2] Frontier Research on Low-Resource Speech Recognition Technology
    Slam, Wushour
    Li, Yanan
    Urouvas, Nurmamet
    SENSORS, 2023, 23 (22)
  • [3] Optimizing Data Usage for Low-Resource Speech Recognition
    Qian, Yanmin
    Zhou, Zhikai
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 394 - 403
  • [4] Low-resource Sinhala Speech Recognition using Deep Learning
    Karunathilaka, Hirunika
    Welgama, Viraj
    Nadungodage, Thilini
    Weerasinghe, Ruvan
    2020 20TH INTERNATIONAL CONFERENCE ON ADVANCES IN ICT FOR EMERGING REGIONS (ICTER-2020), 2020, : 196 - 201
  • [5] MIXSPEECH: DATA AUGMENTATION FOR LOW-RESOURCE AUTOMATIC SPEECH RECOGNITION
    Meng, Linghui
    Xu, Jin
    Tan, Xu
    Wang, Jindong
    Qin, Tao
    Xu, Bo
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 7008 - 7012
  • [6] MixRep: Hidden Representation Mixup for Low-Resource Speech Recognition
    Xie, Jiamin
    Hansen, John H. L.
    INTERSPEECH 2023, 2023, : 1304 - 1308
  • [7] EXPLORING EFFECTIVE DATA UTILIZATION FOR LOW-RESOURCE SPEECH RECOGNITION
    Zhou, Zhikai
    Wang, Wei
    Zhang, Wangyou
    Qian, Yanmin
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 8192 - 8196
  • [8] ANALYSIS OF X-VECTORS FOR LOW-RESOURCE SPEECH RECOGNITION
    Karafiat, Martin
    Vesely, Karel
    Cernocky, Jan Honza
    Profant, Jan
    Nytra, Jiri
    Hlavacek, Miroslav
    Pavlicek, Tomas
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6998 - 7002
  • [9] A General Procedure for Improving Language Models in Low-Resource Speech Recognition
    Liu, Qian
    Zhang, Wei-Qiang
    Liu, Jia
    Liu, Yao
    PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2019, : 428 - 433
  • [10] Zero Shot Text to Speech Augmentation for Automatic Speech Recognition on Low-Resource Accented Speech Corpora
    Nespoli, Francesco
    Barreda, Daniel
    Naylor, Patrick A.
    FIFTY-SEVENTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, IEEECONF, 2023, : 1080 - 1084