OpenASR20: An Open Challenge for Automatic Speech Recognition of Conversational Telephone Speech in Low-Resource Languages

Cited by: 6
Authors
Peterson, Kay [1]
Tong, Audrey [1]
Yu, Yan [2]
Affiliations
[1] NIST, Gaithersburg, MD 20899 USA
[2] Dakota Consulting Inc, Silver Spring, MD USA
Source
INTERSPEECH 2021 | 2021
Keywords
automatic speech recognition; evaluation; low-resource language; conversational telephone speech; IARPA MATERIAL; Amharic; Cantonese; Guarani; Javanese; Kurmanji Kurdish; Mongolian; Pashto; Somali; Tamil; Vietnamese
DOI
10.21437/Interspeech.2021-1930
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
In 2020, the National Institute of Standards and Technology (NIST), in cooperation with the Intelligence Advanced Research Projects Activity (IARPA), conducted an open challenge on automatic speech recognition (ASR) technology for low-resource languages on a challenging data type: conversational telephone speech. The OpenASR20 Challenge was offered for ten low-resource languages: Amharic, Cantonese, Guarani, Javanese, Kurmanji Kurdish, Mongolian, Pashto, Somali, Tamil, and Vietnamese. A total of nine teams from five countries fully participated, and 128 valid submissions were scored. This paper gives an overview of the challenge setup and procedures, as well as a summary of the results. The results show overall high word error rate (WER), with the best results on a severely constrained training data condition ranging from 0.4 to 0.65, depending on the language. ASR with such limited resources remains a challenging problem. Providing a computing platform may be a way to level the playing field and encourage wider participation in challenges like OpenASR.
Pages: 4324-4328
Page count: 5