Bangladeshi Bangla speech corpus for automatic speech recognition research

Cited by: 7
Authors
Kibria, Shafkat [1]
Samin, Ahnaf Mozib [1]
Kobir, M. Humayon [1]
Rahman, M. Shahidur [1]
Selim, M. Reza [1]
Iqbal, M. Zafar [1]
Affiliations
[1] Shahjalal Univ Sci & Technol, Dept Comp Sci & Engn, Sylhet 3114, Bangladesh
Keywords
Bangladeshi Bangla corpus; Automatic speech recognition; Corpora evaluation; Recurrent neural network
DOI
10.1016/j.specom.2021.12.004
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
This article reports the development of a language resource for Bangladeshi Bangla spoken language (BBSL). Bangladeshi Bangla lacks large speech corpora suitable for Large Vocabulary Continuous Speech Recognition (LVCSR) systems. The accuracy of an automatic speech recognition (ASR) system depends on the quality of the speech corpus. This work discusses the common issues and activities involved in developing a large speech corpus named SUBAK.KO. The corpus is designed to support ASR research in Bangladeshi Bangla and is labeled sentence-wise. We trained a model on this corpus using a well-known End-to-End ASR approach, Recurrent Neural Networks (RNNs) with Connectionist Temporal Classification (CTC). To identify its strengths and weaknesses, we measured the Character Error Rate (CER) and Word Error Rate (WER) of the trained RNN-CTC model. A second model was trained on another open-source large Bangla ASR corpus using the same ASR algorithm, and the two trained models were compared to assess the quality of the corpora. SUBAK.KO was found to be the more balanced corpus and to cover more regional accented speech variability for an LVCSR system than the other open-source large Bangla ASR corpus.
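The abstract evaluates the trained models by CER and WER but does not spell out how they are computed. The sketch below uses the conventional definition, edit distance between reference and hypothesis divided by reference length, which is an assumption here and not necessarily the authors' exact scoring script. The function names (`edit_distance`, `wer`, `cer`) are illustrative, and characters are counted as Unicode code points, which may differ from how the paper counts Bangla characters.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance (substitutions, insertions, deletions) between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance over the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: code-point-level edit distance over reference length.

    Assumption: each Unicode code point counts as one character; a
    grapheme-cluster count may be more appropriate for Bangla script.
    """
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)
```

For a transcript pair, `wer(reference, hypothesis)` and `cer(reference, hypothesis)` each return a fraction; both can exceed 1.0 when the hypothesis contains many insertions.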
Pages: 84-97
Number of pages: 14