Feature extraction and classification efficiency analysis using machine learning approach for speech signal

Cited by: 0
Author
Mahesh K. Singh
Affiliation
[1] Aditya Engineering College, Department of ECE
Source
Multimedia Tools and Applications | 2024 / Vol. 83
Keywords
Classification; Feature extraction; Accuracy; SVM; Speech signal
DOI
Not available
Abstract
Classification efficiency analysis of speech signals for speaker identification using machine learning has been a challenge for several years. A machine learning classifier maps the speaker data of interest into several segments, and in a speaker classification system each segment represents a unique speaker. This manuscript uses principal component analysis (PCA), in which speech signals are converted into a set of vectors through feature extraction. Once this is done, a speaker model can be developed for use in further speaker classification. Two machine learning classifiers, a support vector machine (SVM) and k-nearest neighbour (k-NN), are proposed and used to determine the relationship between the speaker voice in the model and the input test voice. Both SVM and k-NN are discussed as algorithms that have shown better classification efficiency in speaker identification. The k-NN classifier has considerably higher detection rates than the SVM classifier, a result that may be inconsistent with some other assessments of SVM. The advantage of k-NN over the other machine learning algorithm is presented: it attained a classification efficiency of 94.45%, against a lower identification rate of 92.90% for SVM. The proposed time-frequency-varying averaging factor in conventional subtraction procedures improves voice quality metrics for diverse noise types and signal-to-noise ratio (SNR) levels. The manuscript also improves voice classification efficiency in terms of accuracy, sensitivity, specificity, precision, recall, and F1-measure using machine learning. The authentication results for the identified features were 55.59% accuracy, 61.11% sensitivity, and 52.55% specificity; precision, recall, and F1-measure were 57.14%, 61.11%, and 57.33%, respectively. Existing approaches were thoroughly examined to build a better classification system.
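The evaluation measures named in the abstract have standard definitions in terms of binary confusion-matrix counts. The sketch below shows those textbook definitions only; it is not the paper's evaluation code. The `ConfusionCounts` container, the function name, and the example counts are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical container, not from the paper)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(c: ConfusionCounts) -> dict:
    """Compute the standard measures from confusion-matrix counts."""
    total = c.tp + c.fp + c.tn + c.fn
    sensitivity = _ratio(c.tp, c.tp + c.fn)  # same quantity as recall
    specificity = _ratio(c.tn, c.tn + c.fp)
    precision = _ratio(c.tp, c.tp + c.fp)
    # F1 is the harmonic mean of precision and recall.
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "recall": sensitivity,
        "f1": f1,
    }


# Illustrative counts only; these are not results from the paper.
example = classification_metrics(ConfusionCounts(tp=8, fp=2, tn=6, fn=4))
```

Under these definitions sensitivity and recall are the same quantity, which is consistent with the abstract reporting 61.11% for both.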
Pages: 47069-47084
Page count: 15