Low-Resource Speech Recognition and Keyword-Spotting

Cited by: 5
Authors
Gales, Mark J. F. [1 ]
Knill, Kate M. [1 ]
Ragni, Anton [1 ]
Affiliations
[1] Univ Cambridge, Dept Engn, Trumpington St, Cambridge, England
Source
SPEECH AND COMPUTER, SPECOM 2017 | 2017 / Vol. 10458
Keywords
Prosody perception; Narrow versus broad focus; Japanese learners of English; L2 acquisition; DEEP NEURAL-NETWORK; DATA AUGMENTATION
DOI
10.1007/978-3-319-66429-3_1
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
The IARPA Babel program ran from March 2012 to November 2016. The aim of the program was to develop agile and robust speech technology that can be rapidly applied to any human language in order to provide effective search capability on large quantities of real-world data. This paper will describe some of the developments in speech recognition and keyword-spotting during the lifetime of the project. Two technical areas will be briefly discussed, with a focus on techniques developed at Cambridge University: the application of deep learning for low-resource speech recognition, and efficient approaches for keyword spotting. Finally, a brief analysis of the Babel speech language characteristics and language performance will be presented.
Pages: 3-19
Page count: 17
Related papers
50 in total
  • [41] Enhancement of Named Entity Recognition in Low-Resource Languages with Data Augmentation and BERT Models: A Case Study on Urdu
    Ullah, Fida
    Gelbukh, Alexander
    Zamir, Muhammad Tayyab
    Riveron, Edgardo Manuel Felipe
    Sidorov, Grigori
    COMPUTERS, 2024, 13 (10)
  • [42] Text-to-speech system for low-resource language using cross-lingual transfer learning and data augmentation
    Zolzaya Byambadorj
    Ryota Nishimura
    Altangerel Ayush
    Kengo Ohta
    Norihide Kitaoka
    EURASIP Journal on Audio, Speech, and Music Processing, 2021
  • [43] Text-to-speech system for low-resource language using cross-lingual transfer learning and data augmentation
    Byambadorj, Zolzaya
    Nishimura, Ryota
    Ayush, Altangerel
    Ohta, Kengo
    Kitaoka, Norihide
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2021, 2021 (01)
  • [44] KL-divergence Regularized Deep Neural Network Adaptation for Low-resource Speaker-dependent Speech Enhancement
    Chai, Li
    Du, Jun
    Lee, Chin-Hui
    INTERSPEECH 2019, 2019, : 1806 - 1810
  • [45] 3Rs: Data Augmentation Techniques Using Document Contexts For Low-Resource Chinese Named Entity Recognition
    Ying, Zheyu
    Zhang, Jinglei
    Xie, Rui
    Wen, Guochang
    Xiao, Feng
    Liu, Xueyang
    Zhang, Shikun
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [46] Low-resource synchronous coincidence processor for positron emission tomography
    Sportelli, Giancarlo
    Belcari, Nicola
    Guerra, Pedro
    Santos, Andres
    NUCLEAR INSTRUMENTS & METHODS IN PHYSICS RESEARCH SECTION A-ACCELERATORS SPECTROMETERS DETECTORS AND ASSOCIATED EQUIPMENT, 2011, 648 : S199 - S201
  • [47] Low-Resource Aspect-Based Sentiment Analysis: A Survey
    Chen Z.
    Qian T.-Y.
    Li W.-L.
    Zhang T.
    Zhou S.
    Zhong M.
    Zhu Y.-Y.
    Liu M.-C.
    Jisuanji Xuebao/Chinese Journal of Computers, 2023, 46 (07): : 1445 - 1472
  • [48] Robust and Informative Text Augmentation (RITA) via Constrained Worst-Case Transformations for Low-Resource Named Entity Recognition
    Sohn, Hyunwoo
    Park, Baekkwan
    PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022, : 1616 - 1624
  • [49] Low-data? No problem: low-resource, language-agnostic conversational text-to-speech via F0-conditioned data augmentation
    Comini, Giulia
    Huybrechts, Goeric
    Ribeiro, Manuel Sam
    Gabrys, Adam
    Lorenzo-Trueba, Jaime
    INTERSPEECH 2022, 2022, : 1946 - 1950
  • [50] Enhancing Low-Resource NLP by Consistency Training With Data and Model Perturbations
    Liang, Xiaobo
    Mao, Runze
    Wu, Lijun
    Li, Juntao
    Zhang, Min
    Li, Qing
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 189 - 199