Bangladeshi Bangla speech corpus for automatic speech recognition research

Cited by: 7
Authors
Kibria, Shafkat [1]
Samin, Ahnaf Mozib [1]
Kobir, M. Humayon [1]
Rahman, M. Shahidur [1]
Selim, M. Reza [1]
Iqbal, M. Zafar [1]
Affiliation
[1] Shahjalal Univ Sci & Technol, Dept Comp Sci & Engn, Sylhet 3114, Bangladesh
Keywords
Bangladeshi Bangla corpus; Automatic speech recognition; Corpora evaluation; Recurrent neural network
DOI
10.1016/j.specom.2021.12.004
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This article reports the development of a language resource for Bangladeshi Bangla spoken language (BBSL). Bangladeshi Bangla lacks adequately large speech corpora for Large Vocabulary Continuous Speech Recognition (LVCSR) systems. The accuracy of an automatic speech recognition (ASR) system depends on the quality of the speech corpus. This work discusses the common issues and activities involved in developing a large speech corpus named SUBAK.KO. The corpus is designed to support ASR research in Bangladeshi Bangla and is labeled sentence-wise. We trained a model on this corpus with one of the well-known current End-to-End ASR algorithms, Recurrent Neural Networks (RNNs) with Connectionist Temporal Classification (CTC). To identify its strengths and weaknesses, we examined the Character Error Rate (CER) and Word Error Rate (WER) of the trained RNN-CTC model. A model was also trained on another open-source large Bangla ASR corpus using the same ASR algorithm, and the two trained models were compared to assess the quality of the corpora. SUBAK.KO was found to be the more balanced corpus and to cover more regionally accented speech variability for an LVCSR system than the other open-source large Bangla ASR corpus.
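The abstract evaluates the models by CER and WER without defining them. As a minimal sketch, assuming the conventional definition (Levenshtein edit distance between reference and hypothesis, divided by the reference length), the two metrics could be computed as below. The function names are illustrative and not taken from the paper, and this version counts Unicode code points (including spaces) for CER, which may differ from the character unit the authors used for Bangla script.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion from the reference
                            curr[j - 1] + 1,      # insertion into the reference
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate over code points (assumption: spaces are counted)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)
```

Both rates can exceed 1.0 when the hypothesis is much longer than the reference, since insertions count as errors.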
Pages: 84-97
Page count: 14