Multilingual Speaker Recognition using Mel-frequency Cepstral Coefficients and Gaussian Mixture Model

Cited by: 0
Authors
Rahul, Mayur [1]
Jha, Sonu Kumar [2]
Prakash, Ayushi [3]
Verma, Sarvachan [3]
Yadav, Vikash [4]
Affiliations
[1] CSJM Univ, Dept Comp Applicat, Kanpur, Uttar Pradesh, India
[2] Galgotias Univ, Greater Noida, Uttar Pradesh, India
[3] Ajay Kumar Garg Engn Coll, Ghaziabad, Uttar Pradesh, India
[4] Govt Polytech Bighapur Unnao, Dept Tech Educ, Manpur, Uttar Pradesh, India
Keywords
Multilingual speaker recognition; MFCC; GMM; TIMIT; biometrics; ASR model; deep neural network
DOI
10.2174/0123520965280852231212041006
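Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract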
Introduction: People can recognize a speaker by voice, including over mobile or digital devices. Methods: To replicate this innate human ability, authentication techniques based on speaker biometrics, such as automated speaker recognition (ASR), have been proposed. An ASR system identifies speakers by analyzing speech signals and extracting salient features from their voices. Results: ASR is becoming an important part of recent research in the voice biometrics field. This paper proposes multilingual speaker recognition using MFCC for feature extraction and GMM for classification, on available datasets such as TIMIT and LibriSpeech. Conclusion: On these datasets, the proposed approach achieves a recognition rate of 70.98% with MFCC.
Pages: 637-643
Number of pages: 7