Comparisons of extreme learning machine and backpropagation-based i-vector approach for speaker identification

Cited by: 6
Authors
Al-Kaltakchi, Musab T. S. [1]
Al-Nima, Raid R. O. [2]
Abdullah, Mohammed A. M. [3]
Affiliations
[1] Mustansiriyah Univ, Coll Engn, Dept Elect Engn, Baghdad, Iraq
[2] Northern Tech Univ, Tech Engn Coll Mosul, Mosul, Iraq
[3] Ninevah Univ, Coll Elect Engn, Dept Comp & Informat Engn, Mosul, Iraq
Keywords
Speaker recognition; extreme learning machine; TIMIT database; i-vector; TRENDS
DOI
10.3906/elk-1906-118
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The extreme learning machine (ELM) is a machine learning method used for regression and classification. In this paper, an extended comparison between an ELM-based and a backpropagation neural network (BPNN)-based i-vector approach is given for a closed-set speaker identification task using 120 speakers from the TIMIT database. The feature extraction stage uses mel-frequency cepstral coefficients (MFCC) and power-normalized cepstral coefficients (PNCC), while cepstral mean and variance normalization (CMVN) and feature warping are applied to mitigate the linear channel effect. The system is evaluated with equal numbers of male and female speakers, 120 in total, covering eight dialects of the TIMIT database. The results demonstrate that combining the i-vector with the ELM yields higher speaker identification accuracy (SIA) across the different features than combining the i-vector with the BPNN. The results also show that the ELM-based i-vector approach is faster than the BPNN-based one.
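For readers unfamiliar with the ELM, the following is a minimal sketch of the standard single-hidden-layer formulation: the hidden-layer weights are drawn at random and left untrained, and only the output weights are solved in closed form with a pseudo-inverse. That is why training is typically much faster than iterative backpropagation. This is not the authors' implementation. The function names, the hidden-layer size, the tanh activation, and the synthetic vectors standing in for i-vectors are all assumptions made for illustration.

```python
import numpy as np


def train_elm(X, y, n_hidden=64, n_classes=None, seed=0):
    """Fit a basic ELM classifier (illustrative sketch, not the paper's code).

    X: (n_samples, n_features) array, e.g. one fixed-length vector per utterance.
    y: (n_samples,) integer speaker labels in [0, n_classes).
    """
    rng = np.random.default_rng(seed)
    if n_classes is None:
        n_classes = int(y.max()) + 1
    # Random input weights and biases; these are never updated.
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)            # hidden-layer output matrix
    T = np.eye(n_classes)[y]          # one-hot targets
    beta = np.linalg.pinv(H) @ T      # least-squares output weights
    return W, b, beta


def predict_elm(model, X):
    """Return the predicted class index for each row of X."""
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)


def make_toy_data(n_classes=3, per_class=30, dim=10, seed=1):
    """Hypothetical stand-in for i-vectors: one Gaussian cluster per speaker."""
    rng = np.random.default_rng(seed)
    centers = 2.0 * rng.standard_normal((n_classes, dim))
    X = np.vstack([c + 0.2 * rng.standard_normal((per_class, dim)) for c in centers])
    y = np.repeat(np.arange(n_classes), per_class)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    model = train_elm(X, y)
    accuracy = np.mean(predict_elm(model, X) == y)
    print(f"training accuracy on toy data: {accuracy:.2f}")
```

The only fitted quantity is `beta`, obtained in a single linear-algebra step. A BPNN would instead update all weights over many gradient iterations.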
Pages: 1236-1245
Page count: 10