Feature extraction and classification efficiency analysis using machine learning approach for speech signal

Cited by: 0

Author:

Mahesh K. Singh

Affiliation:

[1] Aditya Engineering College, Department of ECE

Source:

Multimedia Tools and Applications | 2024 / Vol. 83

Keywords:

Classification; Feature extraction; Accuracy; SVM; Speech signal;

DOI:

Not available

Abstract:

Classification efficiency analysis of speech signals for speaker identification using a machine learning approach has been a challenge for several years. Classification with the machine learning approach maps the speaker data of interest into several segments, and in the speaker classification system each segment represents a unique speaker. This manuscript uses the principal component analysis (PCA) technique, in which speech signals are converted into a set of vectors through feature extraction. Once this is done, a speaker model can be developed for use in further speaker classification. Two machine learning classifiers, the support vector machine (SVM) and k-nearest neighbour (k-NN), are proposed and used to determine the relationship between the speaker's voice in the model and the input test voice. The SVM and k-NN algorithms, which have shown better classification efficiency in speaker identification, are discussed. The k-NN classifier has considerably higher detection rates than the SVM classifier, although this result may be inconsistent with some other assessments of SVM. The advantage of k-NN over the other machine learning algorithm is presented: it attained a classification efficiency of 94.45%, compared with a lower identification rate of 92.90% using SVM. The proposed time-frequency changing averaging factor in conventional subtraction procedures improves voice quality metrics for diverse noise types and signal-to-noise ratio (SNR) levels. This manuscript also enhances voice classification efficiency in terms of accuracy, sensitivity, specificity, precision, recall, and F1-measure using machine learning. The authentication results for the identified features were 55.59% accuracy, 61.11% sensitivity, and 52.55% specificity, with precision, recall, and F1-measure of 57.14%, 61.11%, and 57.33%, respectively. Existing approaches were thoroughly examined to build a better classification system.
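The six evaluation measures named in the abstract all follow from a binary confusion matrix. The sketch below is not taken from the paper; it only illustrates the standard definitions, and the function name `binary_metrics` and the example counts are hypothetical. Note that recall and sensitivity are the same quantity, which is consistent with both being reported as 61.11% above.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from binary confusion-matrix counts.

    tp, fp, tn, fn are true/false positive/negative counts.
    Returns fractions in [0, 1]; a ratio with a zero denominator is 0.0.
    """
    def ratio(num, den):
        # Guard against empty classes instead of raising ZeroDivisionError.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # also called recall
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "recall": sensitivity,
        # F1 is the harmonic mean of precision and recall.
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only for illustration (not the paper's data).
    for name, value in binary_metrics(tp=40, fp=5, tn=45, fn=10).items():
        print(f"{name}: {value:.2%}")
```

For a multi-speaker task such as the one described, these measures would typically be computed per speaker (one speaker versus the rest) and then averaged; whether the paper does so is not stated in the abstract.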

Pages: 47069-47084

Page count: 15
