Efficient feature extraction and classification for the development of Pashto speech recognition system

Authors
Irfan Ahmed
Muhammad Abeer Irfan
Abid Iqbal
Amaad Khalil
Salman Ilahi Siddiqui
Affiliations
[1] University of Engineering and Technology Peshawar, Department of Electrical Engineering
[2] Jalozai Campus, Department of Computer Systems Engineering
[3] University of Engineering and Technology Peshawar
Source
Multimedia Tools and Applications | 2024 / Vol. 83
Keywords
Automatic speech recognition (ASR); Machine learning (ML); Feature extraction; MFCC; DWT; SVM; k-NN
Abstract
In this work, a novel framework for efficient feature extraction and recognition of Pashto speech signals is proposed. Pashto is a low-resource language and is prone to higher Automatic Speech Recognition (ASR) error rates because of its colloquial dialects. We devised a framework that not only employs classical Machine Learning (ML) models for speech recognition tasks but also achieves higher accuracy by using optimal feature extraction techniques. The feature extraction stage is based on two well-known techniques: Discrete Wavelet Transform (DWT) coefficients and Mel-Frequency Cepstral Coefficients (MFCC). We deployed classical ML models, namely Support Vector Machine (SVM) and k-Nearest Neighbors (k-NN), because of their lower computational complexity, better energy efficiency, and higher accuracy compared with other ML and Deep Learning (DL) models. The proposed framework exhibited improved performance when trained on a Pashto isolated-words dataset.
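As a rough illustration of the kind of pipeline the abstract describes (wavelet-based features fed to a classical classifier), the following minimal sketch computes sub-band log-energies from a multi-level DWT and classifies them with k-NN. It is not the authors' implementation: the Haar wavelet, the log-energy features, the three decomposition levels, and the synthetic sinusoids standing in for recorded words are all assumptions made for the example, and the MFCC and SVM parts of the framework are not shown.

```python
import math
import random
from collections import Counter


def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def dwt_energy_features(signal, levels=3):
    """Log-energy of each detail band, followed by the final approximation band."""
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        features.append(math.log(sum(c * c for c in detail) + 1e-12))
    features.append(math.log(sum(c * c for c in approx) + 1e-12))
    return features


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors closest to query (Euclidean)."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]


def synthetic_word(freq, n=256, noise=0.05, rng=random):
    """Hypothetical stand-in for a recorded word: a noisy sinusoid."""
    return [math.sin(2 * math.pi * freq * t) + rng.gauss(0, noise) for t in range(n)]


# Build a tiny two-class training set from synthetic signals (assumed data).
rng = random.Random(0)
classes = {"low": 0.02, "high": 0.20}
train_x, train_y = [], []
for label, freq in classes.items():
    for _ in range(10):
        train_x.append(dwt_energy_features(synthetic_word(freq, rng=rng)))
        train_y.append(label)

# Classify an unseen signal drawn from the "high" class.
query = dwt_energy_features(synthetic_word(0.20, rng=rng))
prediction = knn_predict(train_x, train_y, query)
print(prediction)
```

In a real system the synthetic signals would be replaced by framed audio of isolated words, and the wavelet family, decomposition depth, and value of k would need to be tuned on the dataset.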
Pages: 54081-54096
Page count: 15