INVESTIGATING TECHNIQUES FOR LOW RESOURCE CONVERSATIONAL SPEECH RECOGNITION

Cited by: 0
Authors
Laurent, Antoine [1]
Fraga-Silva, Thiago [1]
Lamel, Lori [2]
Gauvain, Jean-Luc [2]
Affiliations
[1] Vocapia Res, 28 Rue Jean Rostand, F-91400 Orsay, France
[2] CNRS LIMSI, Spoken Language Proc Grp, F-91405 Orsay, France
Source
2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS | 2016
Keywords
low-resource languages; speech recognition; keyword spotting; conversational speech; LANGUAGE
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper we investigate various techniques for building effective speech-to-text (STT) and keyword search (KWS) systems for low-resource conversational speech. Subword decoding and graphemic mappings were assessed for detecting out-of-vocabulary keywords. To deal with the limited amount of transcribed data, semi-supervised training and data selection methods were investigated. Robust acoustic features produced via data augmentation were evaluated for acoustic modeling. For language modeling, automatically retrieved conversational-like web data was used, as well as neural-network-based models. We report STT improvements with all the techniques, but interestingly only some of them improve KWS performance. Results are reported for the Swahili language in the context of the 2015 OpenKWS Evaluation.
Pages: 5975-5979
Number of pages: 5