Low-Resource Speech Recognition and Keyword-Spotting

Cited by: 5
Authors
Gales, Mark J. F. [1]
Knill, Kate M. [1]
Ragni, Anton [1]
Affiliations
[1] Univ Cambridge, Dept Engn, Trumpington St, Cambridge, England
Source
SPEECH AND COMPUTER, SPECOM 2017 | 2017, Vol. 10458
Keywords
Prosody perception; Narrow versus broad focus; Japanese learners of English; L2; acquisition; DEEP NEURAL-NETWORK; DATA AUGMENTATION;
DOI
10.1007/978-3-319-66429-3_1
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
The IARPA Babel program ran from March 2012 to November 2016. The aim of the program was to develop agile and robust speech technology that can be rapidly applied to any human language in order to provide effective search capability on large quantities of real world data. This paper will describe some of the developments in speech recognition and keyword-spotting during the lifetime of the project. Two technical areas will be briefly discussed with a focus on techniques developed at Cambridge University: the application of deep learning for low-resource speech recognition; and efficient approaches for keyword spotting. Finally a brief analysis of the Babel speech language characteristics and language performance will be presented.
Pages: 3-19
Number of pages: 17
Related papers
50 in total
  • [21] External Text Based Data Augmentation for Low-Resource Speech Recognition in the Constrained Condition of OpenASR21 Challenge
    Zhong, Guolong
    Song, Hongyu
    Wang, Ruoyu
    Sun, Lei
    Liu, Diyuan
    Pan, Jia
    Fang, Xin
    Du, Jun
    Zhang, Jie
    Dai, Lirong
    INTERSPEECH 2022, 2022, : 4860 - 4864
  • [22] Deep Neural Network based Feature Extraction Using Convex-nonnegative Matrix Factorization for Low-resource Speech Recognition
    Qin, Chuxiong
    Zhang, Lianhai
    2016 IEEE INFORMATION TECHNOLOGY, NETWORKING, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (ITNEC), 2016, : 1082 - 1086
  • [23] Low-Resource Named Entity Recognition via the Pre-Training Model
    Chen, Siqi
    Pei, Yijie
    Ke, Zunwang
    Silamu, Wushour
    SYMMETRY-BASEL, 2021, 13 (05):
  • [24] Exogenous and Endogenous Data Augmentation for Low-Resource Complex Named Entity Recognition
    Zhang, Xinghua
    Chen, Gaode
    Cui, Shiyao
    Sheng, Jiawei
    Liu, Tingwen
    Xu, Hongbo
    PROCEEDINGS OF THE 47TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2024, 2024, : 630 - 640
  • [25] Image-Mediated Data Augmentation for Low-Resource Human Activity Recognition
    Wang, Zihao
    Qu, Youli
    Tao, Junru
    Song, Yudan
    PROCEEDINGS OF THE 2019 THE 3RD INTERNATIONAL CONFERENCE ON COMPUTE AND DATA ANALYSIS (ICCDA 2019), 2019, : 49 - 54
  • [26] Improving the performance of keyword spotting system for children's speech through prosody modification
    Shahnawazuddin, S.
    Maity, Karabi
    Pradhan, Gayadhar
    DIGITAL SIGNAL PROCESSING, 2019, 86 : 11 - 18
  • [27] A Study on Low-resource Language Identification
    Qi, Zhaodi
    Ma, Yong
    Gu, Mingliang
    2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 1897 - 1902
  • [28] Improving Low Resource Turkish Speech Recognition with Data Augmentation and TTS
    Gokay, Ramazan
    Yalcin, Hulya
    2019 16TH INTERNATIONAL MULTI-CONFERENCE ON SYSTEMS, SIGNALS & DEVICES (SSD), 2019, : 357 - 360
  • [29] Investigations of Low Resource Multi-Accent Mandarin Speech Recognition
    Wang, Wei
    Xu, Wenying
    Sui, Xiang
    Wang, Lan
    Liu, Xunying
    2015 IEEE INTERNATIONAL CONFERENCE ON INFORMATION AND AUTOMATION, 2015, : 62 - 66
  • [30] Semi-supervised Named Entity Recognition for Low-Resource Languages Using Dual PLMs
    Yohannes, Hailemariam Mehari
    Lynden, Steven
    Amagasa, Toshiyuki
    Matono, Akiyoshi
    NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, PT I, NLDB 2024, 2024, 14762 : 166 - 180