Bangladeshi Bangla speech corpus for automatic speech recognition research

Cited by: 7
Authors
Kibria, Shafkat [1 ]
Samin, Ahnaf Mozib [1 ]
Kobir, M. Humayon [1 ]
Rahman, M. Shahidur [1 ]
Selim, M. Reza [1 ]
Iqbal, M. Zafar [1 ]
Affiliations
[1] Shahjalal Univ Sci & Technol, Dept Comp Sci & Engn, Sylhet 3114, Bangladesh
Keywords
Bangladeshi Bangla corpus; Automatic speech recognition; Corpora evaluation; Recurrent neural network
DOI
10.1016/j.specom.2021.12.004
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
This article reports the development of a language resource for the Bangladeshi Bangla spoken language (BBSL). Bangladeshi Bangla lacks large speech corpora suitable for Large Vocabulary Continuous Speech Recognition (LVCSR) systems. The accuracy of an automatic speech recognition (ASR) system depends on the quality of the speech corpus. This work discusses the common issues and activities involved in developing a large speech corpus named SUBAK.KO. The corpus is designed to support ASR research in Bangladeshi Bangla and is labeled sentence-wise. We trained a model on this corpus using one of the well-known current end-to-end ASR approaches, Recurrent Neural Networks (RNNs) with Connectionist Temporal Classification (CTC). To identify strengths and weaknesses, we observed the Character Error Rate (CER) and Word Error Rate (WER) of the trained RNN-CTC model. A second model was trained on another open-source large Bangla ASR corpus using the same algorithm, and the two trained models were compared to assess the quality of the corpora. SUBAK.KO was found to be the more balanced corpus and to cover more regional-accent speech variability for an LVCSR system than that open-source large Bangla ASR corpus.
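The abstract evaluates the trained models by CER and WER. As a rough illustration only, both metrics are commonly computed as the Levenshtein edit distance between the reference and the recognized transcript, divided by the reference length (in characters for CER, in words for WER). The sketch below follows that common definition; it is not the authors' evaluation code. The function names are hypothetical, and treating each Unicode code point (spaces included) as one character is an assumption that may differ from how the paper counts Bangla characters.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edit distance over reference length.

    Assumption: one Unicode code point (spaces included) counts as one character.
    """
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    ref, hyp = "the cat sat", "the cat sit"
    print(f"WER = {wer(ref, hyp):.3f}, CER = {cer(ref, hyp):.3f}")
```

Because the denominator is the reference length, either rate can exceed 1.0 when the hypothesis contains many insertions. For Bangla text, counting grapheme clusters instead of code points would give different CER values, so the unit of counting should be stated alongside any reported figure.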
Pages: 84-97
Number of pages: 14
Related papers
50 in total
  • [1] Chhattisgarhi speech corpus for research and development in automatic speech recognition
    Londhe, Narendra D.
    Kshirsagar, Ghanahshyam B.
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2018, 21 (02) : 193 - 210
  • [2] Gender Independent Bangla Automatic Speech Recognition
    Hassan, Foyzul
    Kotwal, Mohammed Rokibul Alam
    Khan, Mohammad Saiful Alam
    Huda, Mohammad Nurul
    2012 INTERNATIONAL CONFERENCE ON INFORMATICS, ELECTRONICS & VISION (ICIEV), 2012, : 143 - 148
  • [3] Phonetic Features Enhancement for Bangla Automatic Speech Recognition
    Kabir, Sharif M. Rasel
    Hassan, Foyzul
    Ahamed, Foysal
    Mamun, Khondokar
    Huda, Mohammad Nurul
    Nusrat, Fariha
    2015 INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION ENGINEERING (ICCIE), 2015, : 25 - 28
  • [4] The Makerere Radio Speech Corpus: A Luganda Radio Corpus for Automatic Speech Recognition
    Mukiibi, Jonathan
    Katumba, Andrew
    Nakatumba-Nabende, Joyce
    Hussein, Ali
    Meyer, Josh
    LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 1945 - 1954
  • [5] RSC: A Romanian Read Speech Corpus for Automatic Speech Recognition
    Georgescu, Alexandru-Lucian
    Cucu, Horia
    Buzo, Andi
    Burileanu, Corneliu
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 6606 - 6612
  • [6] KsponSpeech: Korean Spontaneous Speech Corpus for Automatic Speech Recognition
    Bang, Jeong-Uk
    Yun, Seung
    Kim, Seung-Hi
    Choi, Mu-Yeol
    Lee, Min-Kyu
    Kim, Yeo-Jeong
    Kim, Dong-Hyun
    Park, Jun
    Lee, Young-Jik
    Kim, Sang-Hun
    APPLIED SCIENCES-BASEL, 2020, 10 (19) : 1 - 17
  • [7] Multimodal English corpus for automatic speech recognition
    Kunka, Bartosz
    Kupryjanow, Adam
    Dalka, Piotr
    Bratoszewski, Piotr
    Szczodrak, Maciej
    Spaleniak, Pawel
    Szykulski, Marcin
    Czyzewski, Andrzej
    2013 SIGNAL PROCESSING: ALGORITHMS, ARCHITECTURES, ARRANGEMENTS, AND APPLICATIONS (SPA), 2013, : 106 - 111
  • [8] CEASR: A Corpus for Evaluating Automatic Speech Recognition
    Ulasik, Malgorzata Anna
    Huerlimann, Manuela
    Germann, Fabian
    Gedik, Esin
    Benites, Fernando
    Cieliebak, Mark
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 6477 - 6485
  • [9] Towards a Continuous Speech Corpus for Banking Domain Automatic Speech Recognition
    Suciu, George
    Toma, Stefan-Adrian
    Cheyeresan, Romulus
    2017 INTERNATIONAL CONFERENCE ON SPEECH TECHNOLOGY AND HUMAN-COMPUTER DIALOGUE (SPED), 2017,
  • [10] The Development of Isolated Words Corpus of Pashto for the Automatic Speech Recognition Research
    Ahmed, Irfan
    Ahmad, Nasir
    Ali, Hazrat
    Ahmad, Gulzar
    2012 INTERNATIONAL CONFERENCE ON ROBOTICS AND ARTIFICIAL INTELLIGENCE (ICRAI), 2012, : 139 - 143