Development of Standard Yorùbá speech-to-text system using HTK

Cited by: 5
Authors
Adetunmbi, O. A. [1]
Obe, O. O. [1]
Iyanda, J. N. [2]
Affiliations
[1] Fed Univ Technol Akure, Dept Comp Sci, Akure, Ondo State, Nigeria
[2] Joseph Ayo Babalola Univ, Dept Comp Sci, Ikeji Arakeji, Osun State, Nigeria
Keywords
Standard Yoruba language; MFCC; HTK; GUI; Praat;
DOI
10.1007/s10772-016-9380-2
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline classification codes
0808; 0809;
Abstract
This paper presents the design and implementation of a Standard Yorùbá speech-to-text system that recognizes isolated words spoken by users on the basis of previously stored data. The system adopts a syllable-based approach: carefully selected words were recorded, analyzed, and annotated using Praat software. An experimental database was collected from six native speakers, each speaking 25 bi-syllabic and 25 tri-syllabic words in an acoustically controlled room. Meaningful spectral coefficients were extracted using the Mel-frequency cepstral coefficients (MFCC) technique, and the Hidden Markov Model Toolkit (HTK) was used to implement the system. A graphical user interface (GUI) was also developed to make the system more accessible and interactive. The system was then tested and evaluated based on the perception of native speakers of the language. The overall accuracy was 76% for bi-syllabic words and 84% for tri-syllabic words. These results suggest that the approach is promising and could be adopted for a Standard Yorùbá continuous speech recognition system, which would make the system usable by foreign speakers.
Pages: 929-944
Number of pages: 16