Exploring end-to-end framework towards Khasi speech recognition system

Cited by: 4

Authors:

Syiem, Bronson [1]

Singh, L. Joyprakash [1]

Affiliations:

[1] NEHU, Elect & Commun Engn, Shillong 793022, Meghalaya, India

Source:

INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY | 2021 / Vol. 24 / Issue 02

Keywords:

Automatic speech recognition; Deep neural network; End-to-End; Hidden Markov model;

DOI:

10.1007/s10772-021-09811-5

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

Building a conventional automatic speech recognition (ASR) system based on a hidden Markov model (HMM)/deep neural network (DNN) is complex because it requires several modules, such as acoustic models, lexicons, linguistic resources and language models, which is particularly difficult for low-resource languages. In contrast, the End-to-End architecture greatly simplifies the model-building process by representing these complex modules with a simple deep network and by replacing linguistic resources with data-driven learning techniques. In this paper, we present our prior work exploring an End-to-End (E2E) framework for a Khasi speech recognition system, together with a novel extension towards the development of speech corpora for the standard Khasi dialect. We implemented the proposed E2E model using the Nabu ASR toolkit. Additionally, three other models (monophone, triphone and hybrid DNN) were built. A comparison of the results shows that the proposed method achieved a significant improvement, particularly with connectionist temporal classification (CTC), which yielded a character error rate (CER) of 5.04%.
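
The abstract reports its headline result as a character error rate (CER) but does not spell out how it is computed. The following is a minimal sketch, assuming the commonly used definition (character-level Levenshtein distance between reference and hypothesis, divided by the reference length). The function names `edit_distance` and `character_error_rate` are hypothetical; they are not taken from the paper or from the Nabu toolkit, and the authors' exact scoring procedure may differ.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference character
                curr[j - 1] + 1,     # insertion of a hypothesis character
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER under the assumed definition: edit distance / reference length."""
    if not ref:
        raise ValueError("reference transcript must be non-empty")
    return edit_distance(ref, hyp) / len(ref)
```

Under this definition, a CER of 5.04% would correspond to roughly five character edits per hundred reference characters; whether spaces and punctuation are counted is a scoring choice the abstract does not state.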

Pages: 419-424

Number of pages: 6
