DWT features performance analysis for automatic speech recognition of Urdu

Cited by: 15
Authors
Ali, Hazrat [1 ,2 ]
Ahmad, Nasir [3 ]
Zhou, Xianwei [2 ]
Iqbal, Khalid [2 ]
Ali, Sahibzada Muhammad [4 ]
Affiliations
[1] City Univ London, Dept Comp, Machine Learning Grp, London EC1V 0HB, England
[2] Univ Sci & Technol Beijing, Sch Comp & Commun Engn, Beijing 100083, Peoples R China
[3] Univ Engn & Technol Peshawar, Dept Comp Syst Engn, Peshawar 25120, Pakistan
[4] N Dakota State Univ, Dept Elect & Comp Engn, Fargo, ND 58108 USA
Source
SpringerPlus | 2014 / Vol. 3
Keywords
Automatic speech recognition; Discrete wavelet transforms; Linear discriminant analysis; Mel-frequency cepstral coefficients; Urdu isolated words recognition; WAVELET;
DOI
10.1186/2193-1801-3-204
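Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes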
07 ; 0710 ; 09 ;
Abstract
This paper presents work on automatic speech recognition of the Urdu language, comparing features based on the Discrete Wavelet Transform (DWT) with Mel-Frequency Cepstral Coefficients (MFCC). The features were extracted for one hundred isolated Urdu words, each uttered by ten different speakers. The words were selected from the most frequently used words of Urdu. A balanced corpus approach was used to cover a variety of ages and dialects. After feature extraction, classification was performed using Linear Discriminant Analysis. The confusion matrix obtained for the DWT features was then compared with the one obtained for MFCC-based speech recognition. The framework was trained and tested on speech data recorded under controlled environments. The experimental results are useful in determining the optimum features for the speech recognition task.
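As a rough illustration of what DWT-based features can look like, the sketch below computes subband energies from a multi-level Haar decomposition of a signal. It is not the authors' pipeline: the abstract does not state the wavelet family, the number of decomposition levels, or the exact feature definition, so the Haar wavelet, the `levels=3` default, and the function names `haar_dwt_step` and `dwt_energy_features` are all assumptions made for this example.

```python
import math


def haar_dwt_step(signal):
    """One level of the orthonormal Haar DWT; returns (approximation, detail)."""
    signal = list(signal)
    if len(signal) % 2:
        # Assumption: pad odd-length input by repeating the last sample.
        signal.append(signal[-1])
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def dwt_energy_features(signal, levels=3):
    """Energy of each detail subband, followed by the final approximation energy.

    Hypothetical feature definition chosen for illustration only.
    """
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in approx))
    return features


if __name__ == "__main__":
    # A toy 8-sample "utterance"; real input would be recorded speech samples.
    print(dwt_energy_features([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]))
```

One fixed-length vector per utterance, such as the one returned here, could then be passed to a linear discriminant classifier (for instance `LinearDiscriminantAnalysis` in scikit-learn), and the resulting confusion matrix compared with the one from MFCC features.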
Pages: 1 / 10
Number of pages: 10