SPEECH TO SPEECH BASED EFFORTLESS MALAYALAM DICTIONARY USING KALDI AND EFFECT OF CVR MODIFICATION ON ISOLATED WORD RECOGNITION

Cited by: 0
Authors
Jeethu, Mary A. J. [1]
Jayan, A. R. [1]
Affiliations
[1] Govt Engn Coll, Dept ECE, Trichur, India
Source
2022 IEEE 19TH INDIA COUNCIL INTERNATIONAL CONFERENCE, INDICON | 2022
Keywords
Automatic Speech Recognition; MFCC; Kaldi; Malayalam; CVR Modification;
DOI
10.1109/INDICON56171.2022.10039854
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Speech is one of the most natural and easiest means of communication among human beings, and applications of speech recognition are growing rapidly. Malayalam is a low-resource Dravidian language spoken by 34 million people in India, in Kerala and the union territories of Lakshadweep and Puducherry. The objective of this paper is to develop a speech-to-speech Malayalam dictionary based on Automatic Speech Recognition (ASR) using the Kaldi ASR toolkit. The dictionary consists of one hundred isolated utterances in Malayalam. When the system is prompted with a spoken word, it returns the meaning of that complex word as an output speech utterance. The Kaldi toolkit is highly flexible and adaptive. We use Mel Frequency Cepstral Coefficients (MFCC) as features in the ASR module. The recognizer is tested with two sets of test data. On the first set, the word error rate (WER) is 29% for monophone training, 21% for triphone training, 17% for triphone LDA training, and 16% for triphone SAT training. On the second set, the WER for triphone SAT training is 19% for speaker 1 and 10% for speaker 2. With the same setup, the vocabulary can be enlarged to handle all the unfamiliar words in Malayalam, which would let users treat the system as a fully functional speech-activated Malayalam dictionary. The paper also examines the effect of Consonant Vowel Ratio (CVR) modification on isolated Malayalam word recognition when the speech input is corrupted by additive noise.
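The WER figures quoted in the abstract can be read with the standard definition of word error rate: the word-level edit distance (substitutions, deletions, insertions) between reference and recognized transcripts, divided by the number of reference words. The following is a minimal sketch of that definition, not the authors' scoring code; the function names and the placeholder word labels are hypothetical. For isolated-word recognition, where each utterance is a single word, the measure reduces to the fraction of misrecognized words.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i] + [0] * len(hyp_words)
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1]


def corpus_wer(pairs):
    """WER over (reference, hypothesis) transcript pairs, as a fraction."""
    total_edits = 0
    total_ref_words = 0
    for reference, hypothesis in pairs:
        ref_words = reference.split()
        total_edits += edit_distance(ref_words, hypothesis.split())
        total_ref_words += len(ref_words)
    if total_ref_words == 0:
        raise ValueError("reference transcripts contain no words")
    return total_edits / total_ref_words


# Hypothetical isolated-word test set: 100 single-word utterances,
# 16 of them misrecognized (placeholder labels, not data from the paper).
correct = [(f"word{i}", f"word{i}") for i in range(84)]
wrong = [(f"word{i}", "other") for i in range(84, 100)]
print(f"WER = {corpus_wer(correct + wrong):.0%}")
```

Under this definition, a WER of 16% on a 100-word isolated-word test set would correspond to 16 misrecognized utterances.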
Pages: 6