ELM speaker identification for limited dataset using multitaper based MFCC and PNCC features with fusion score

Cited by: 11
Authors
Bharath, K. P. [1]
Kumar, Rajesh M. [1]
Affiliations
[1] Vellore Inst Technol, Sch Elect Engn, Vellore, Tamil Nadu, India
Keywords
Multitaper; MFCC; PNCC; Frequency warping; CMVN; EXTREME LEARNING-MACHINE; I-VECTOR; CONTINUOUS AUTHENTICATION; RECOGNITION; SPEECH; COMPENSATION
DOI
10.1007/s11042-020-09353-z
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Speaker recognition under noisy conditions remains a major challenge in speech processing, because environmental noise significantly degrades system performance. The main aim of the proposed work is to identify speakers against clean and noisy backgrounds using a limited dataset. In this paper, we propose multitaper-based Mel-frequency cepstral coefficient (MFCC) and power-normalized cepstral coefficient (PNCC) techniques with fusion strategies. MFCC and PNCC are computed with different multitapers to extract features from the speech samples. Cepstral mean and variance normalization (CMVN) and feature warping (FW) are then applied to normalize the features obtained from both techniques. A low-dimensional i-vector model serves as the system model, and several score fusion strategies are used: mean, maximum, weighted sum, cumulative, and concatenated fusion. Finally, an extreme learning machine (ELM) is used for classification to increase the system identification accuracy (SIA); the ELM is a single-hidden-layer feedforward neural network that is less complex and less time-consuming than other neural networks. The proposed system is evaluated on limited data from two databases, TIMIT and SITW 2016, under both clean and noisy background conditions.
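Three of the steps named in the abstract (CMVN, score fusion, and ELM classification) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function and class names (`cmvn`, `fuse_scores`, `ELM`), the tanh activation, the hidden-layer size, and the fusion weight are all assumptions chosen for illustration, and the multitaper MFCC/PNCC extraction, feature warping, and i-vector modeling are omitted.

```python
import numpy as np


def cmvn(features, eps=1e-8):
    """Per-utterance cepstral mean and variance normalization.

    features: array of shape (num_frames, num_coeffs).
    Each coefficient track is shifted to zero mean and scaled to unit variance.
    """
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    return (features - mean) / (std + eps)


def fuse_scores(scores_a, scores_b, weight=0.5):
    """Fuse two score vectors (e.g. one per feature type) in three ways.

    `weight` is a hypothetical weight on scores_a for the weighted sum.
    """
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    return {
        "mean": (a + b) / 2.0,
        "max": np.maximum(a, b),
        "weighted_sum": weight * a + (1.0 - weight) * b,
    }


class ELM:
    """Extreme learning machine: random fixed hidden layer, least-squares output."""

    def __init__(self, n_hidden=64, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Hidden activations; tanh is an assumed choice of nonlinearity.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        n_classes = int(y.max()) + 1
        # Hidden weights are drawn once at random and never trained.
        self.W = self.rng.standard_normal((X.shape[1], self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        targets = np.eye(n_classes)[y]  # one-hot class targets
        # Output weights via the Moore-Penrose pseudoinverse (least squares).
        self.beta = np.linalg.pinv(self._hidden(X)) @ targets
        return self

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.beta, axis=1)
```

Because only the output weights are solved for, in closed form, training needs no iterative backpropagation, which is the usual reason an ELM is described as fast to train.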
Pages: 28859-28883
Number of pages: 25