Feature extraction and classification efficiency analysis using machine learning approach for speech signal

Cited by: 1
Author
Singh, Mahesh K. [1]
Affiliation
[1] Aditya Engn Coll, Dept ECE, Surampalem, India
Keywords
Classification; Feature extraction; Accuracy; SVM; Speech signal; Voice
DOI
10.1007/s11042-023-17368-5
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The problem of classification efficiency analysis in speech signals for speaker identification using a machine learning approach has been a challenge for several years. Classification with a machine learning approach maps the speaker data of interest into several segments. In the speaker classification system, each segment represents a unique speaker. This manuscript uses principal component analysis (PCA), in which speech signals are converted into a set of vectors through feature extraction. Once this is done, a speaker model can be developed for use in further speaker classification. Dual classifiers based on machine learning, a support vector machine (SVM) and k-nearest neighbour (k-NN), are proposed and used to determine the relationship between the speaker's voice in the model and the input test voice. SVM and k-NN, algorithms that have shown better classification efficiency in speaker identification, are discussed. The k-NN classifier has considerably higher detection rates than the SVM classifier. The proposed k-NN result may be inconsistent with some other assessments concerning SVM. The advantages of k-NN over other machine learning algorithms are presented: the k-NN classifier attained an outstanding classification efficiency of 94.45%, whereas SVM attained a significantly lower identification rate of 92.90%. The proposed time-frequency changing averaging factor in conventional subtraction procedures improves voice quality metrics for diverse noise types and signal-to-noise ratio (SNR) levels. This manuscript also enhances voice classification efficiency in terms of accuracy, sensitivity, specificity, precision, recall, and F1-measure using machine learning. The authentication results for identified features, namely accuracy, sensitivity, and specificity, were 55.59%, 61.11%, and 52.55%, respectively. Precision, recall, and F1-measure were 57.14%, 61.11%, and 57.33%, respectively. Existing approaches were thoroughly examined to build a better classification system.
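The evaluation measures named in the abstract can all be derived from the four counts of a binary confusion matrix. The following is a minimal sketch in Python, assuming the standard textbook definitions; this record does not state how the paper computed its figures. The function name `binary_metrics` and the example counts are hypothetical and are not the paper's data.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common classification measures from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. Standard definitions are assumed; recall is the same
    quantity as sensitivity.
    """
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)            # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "recall": sensitivity,
        "f1": f1,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = binary_metrics(tp=8, fp=2, tn=6, fn=4)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Under these definitions sensitivity and recall always coincide, which matches the identical 61.11% reported for both in the abstract.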
Pages: 47069-47084
Number of pages: 16
Related papers
22 in total
[1]   Improved Feature Parameter Extraction from Speech Signals Using Machine Learning Algorithm [J].
Abdusalomov, Akmalbek Bobomirzaevich ;
Safarov, Furkat ;
Rakhimov, Mekhriddin ;
Turaev, Boburkhon ;
Whangbo, Taeg Keun .
SENSORS, 2022, 22 (21)
[2]   Two-Way Feature Extraction for Speech Emotion Recognition Using Deep Learning [J].
Aggarwal, Apeksha ;
Srivastava, Akshat ;
Agarwal, Ajay ;
Chahal, Nidhi ;
Singh, Dilbag ;
Alnuaim, Abeer Ali ;
Alhadlaq, Aseel ;
Lee, Heung-No .
SENSORS, 2022, 22 (06)
[3]  
Aouani Hadhami, 2020, Procedia Computer Science, V176, P251, DOI 10.1016/j.procs.2020.08.027
[4]   An Overview of Audio Event Detection Methods from Feature Extraction to Classification [J].
Babaee, Elham ;
Anuar, Nor Badrul ;
Wahab, Ainuddin Wahid Abdul ;
Shamshirband, Shahaboddin ;
Chronopoulos, Anthony T. .
APPLIED ARTIFICIAL INTELLIGENCE, 2017, 31 (9-10) :661-714
[5]   A review on speech processing using machine learning paradigm [J].
Bhangale, Kishor Barasu ;
Mohanaprasad, K. .
INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2021, 24 (02) :367-388
[6]  
Dara Suresh, 2018, 2018 Second International Conference on Electronics, Communication and Aerospace Technology (ICECA), P1795, DOI 10.1109/ICECA.2018.8474912
[7]   A Survey of Deep Learning and Its Applications: A New Paradigm to Machine Learning [J].
Dargan, Shaveta ;
Kumar, Munish ;
Ayyagari, Maruthi Rohit ;
Kumar, Gulshan .
ARCHIVES OF COMPUTATIONAL METHODS IN ENGINEERING, 2020, 27 (04) :1071-1092
[8]   A Survey on Machine Learning Approaches for Automatic Detection of Voice Disorders [J].
Hegde, Sarika ;
Shetty, Surendra ;
Rai, Smitha ;
Dodderi, Thejaswi .
JOURNAL OF VOICE, 2019, 33 (06) :947.e11-947.e33
[9]   1D-local binary pattern based feature extraction for classification of epileptic EEG signals [J].
Kaya, Yilmaz ;
Uyar, Murat ;
Tekin, Ramazan ;
Yildirim, Selcuk .
APPLIED MATHEMATICS AND COMPUTATION, 2014, 243 :209-219
[10]   Speech Emotion Recognition Using Deep Learning Techniques: A Review [J].
Khalil, Ruhul Amin ;
Jones, Edward ;
Babar, Mohammad Inayatullah ;
Jan, Tariqullah ;
Zafar, Mohammad Haseeb ;
Alhussain, Thamer .
IEEE ACCESS, 2019, 7 :117327-117345