ELM speaker identification for limited dataset using multitaper based MFCC and PNCC features with fusion score

Cited by: 11
Authors
Bharath, K. P. [1 ]
Kumar, Rajesh M. [1 ]
Affiliations
[1] Vellore Inst Technol, Sch Elect Engn, Vellore, Tamil Nadu, India
Keywords
Multitaper; MFCC; PNCC; Frequency warping; CMVN; EXTREME LEARNING-MACHINE; I-VECTOR; CONTINUOUS AUTHENTICATION; RECOGNITION; SPEECH; COMPENSATION;
DOI
10.1007/s11042-020-09353-z
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Speaker recognition under noisy conditions is a major challenge in speech processing, because environmental noise significantly degrades system performance. The main aim of the proposed work is to identify speakers against clean and noisy backgrounds using a limited dataset. In this paper, we propose multitaper-based Mel frequency cepstral coefficient (MFCC) and power-normalized cepstral coefficient (PNCC) techniques with fusion strategies. MFCC and PNCC are computed with different multitapers to extract features from the speech samples. Two techniques, cepstral mean and variance normalization (CMVN) and feature warping (FW), are then applied to normalize the features obtained from both methods. A low-dimensional i-vector model is used as the system model, and several score fusion strategies are applied: mean, maximum, weighted sum, cumulative, and concatenated fusion. Finally, an extreme learning machine (ELM) is used for classification to increase the system identification accuracy (SIA); the ELM is a single-hidden-layer feedforward neural network that is less complex and less time-consuming than other neural networks. The proposed system is evaluated on two databases, TIMIT and SITW 2016, using limited data from each. Both clean and noisy background conditions are used to assess the SIA.
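The abstract names several processing stages without showing them. The following is a minimal sketch of three of those stages (CMVN, score-level fusion by mean, maximum, and weighted sum, and a basic ELM classifier), written with NumPy under stated assumptions. It is not the authors' implementation: the function and class names (`cmvn`, `fuse_scores`, `ELM`), the tanh activation, the hidden-layer size, and the default weight are all hypothetical choices made for illustration. Multitaper MFCC/PNCC extraction, feature warping, and the i-vector model are omitted, as are the cumulative and concatenated fusion rules, which the abstract does not define.

```python
import numpy as np


def cmvn(features, eps=1e-8):
    """Cepstral mean and variance normalization over the time axis.

    features: array of shape (n_frames, n_coeffs).
    Each coefficient is shifted to zero mean and scaled to unit variance.
    """
    mean = features.mean(axis=0, keepdims=True)
    std = features.std(axis=0, keepdims=True)
    return (features - mean) / (std + eps)


def fuse_scores(scores_a, scores_b, rule="mean", weight=0.5):
    """Combine per-speaker scores from two subsystems (e.g. MFCC and PNCC).

    weight is only used by the weighted-sum rule (assumed convention:
    weight applies to scores_a, 1 - weight to scores_b).
    """
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    if rule == "mean":
        return (a + b) / 2.0
    if rule == "max":
        return np.maximum(a, b)
    if rule == "weighted_sum":
        return weight * a + (1.0 - weight) * b
    raise ValueError(f"unknown fusion rule: {rule}")


class ELM:
    """Basic extreme learning machine: random hidden layer, least-squares output."""

    def __init__(self, n_hidden=64, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Hidden weights are random and never trained.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        n_classes = int(np.max(y)) + 1
        targets = np.eye(n_classes)[y]  # one-hot speaker labels
        self.W = self.rng.standard_normal((X.shape[1], self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        # Output weights via the Moore-Penrose pseudoinverse.
        self.beta = np.linalg.pinv(self._hidden(X)) @ targets
        return self

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.beta, axis=1)
```

In this sketch the only trained parameters are the output weights, obtained in one linear-algebra step rather than by iterative backpropagation, which is the usual reason an ELM is described as fast to train.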
Pages: 28859-28883
Number of pages: 25
Related papers
50 in total
  • [1] ELM speaker identification for limited dataset using multitaper based MFCC and PNCC features with fusion score
    Bharath K P
    Rajesh Kumar M
    Multimedia Tools and Applications, 2020, 79 : 28859 - 28883
  • [2] STUDY OF FUSION STRATEGIES AND EXPLOITING THE COMBINATION OF MFCC AND PNCC FEATURES FOR ROBUST BIOMETRIC SPEAKER IDENTIFICATION
    Al-Kaltakchi, M. T. S.
    Woo, W. L.
    Dlay, S. S.
    Chambers, J. A.
    2016 4TH INTERNATIONAL WORKSHOP ON BIOMETRICS AND FORENSICS (IWBF), 2016,
  • [3] Multitaper MFCC and normalized multitaper phase-based features for speaker verification
    Mansouri, Arash
    Castillo-Guerra, Eduardo
    SN APPLIED SCIENCES, 2019, 1 (04):
  • [5] Source Microphone Identification Using Multitaper MFCC Features
    Eskidere, Omer
    Karatutlu, Ali
    2015 9TH INTERNATIONAL CONFERENCE ON ELECTRICAL AND ELECTRONICS ENGINEERING (ELECO), 2015, : 227 - 231
  • [6] Multitaper MFCC and PLP features for speaker verification using i-vectors
    Alam, Md Jahangir
    Kinnunen, Tomi
    Kenny, Patrick
    Ouellet, Pierre
    O'Shaughnessy, Douglas
    SPEECH COMMUNICATION, 2013, 55 (02) : 237 - 251
  • [7] Speaker Identification and Verification of Noisy Speech Using Multitaper MFCC and Gaussian Mixture Models
    Veena, K. V.
    Mathew, Dominic
    PROCEEDINGS OF 2015 IEEE INTERNATIONAL CONFERENCE ON POWER, INSTRUMENTATION, CONTROL AND COMPUTING (PICC), 2015,
  • [8] Speaker identification based on combination of MFCC and UMRT based features
    Antony, Anett
    Gopikakumari, R.
    8TH INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING & COMMUNICATIONS (ICACC-2018), 2018, 143 : 250 - 257
  • [9] A Speaker Identification System using MFCC Features with VQ Technique
    Zulfiqar, Ali
    Muhammad, Aslam
    Martinez-Enriquez, A. M.
    2009 THIRD INTERNATIONAL SYMPOSIUM ON INTELLIGENT INFORMATION TECHNOLOGY APPLICATION, VOL 3, PROCEEDINGS, 2009, : 115 - +
  • [10] Speaker identification and localization using shuffled MFCC features and deep learning
    Barhoush M.
    Hallawa A.
    Schmeink A.
    International Journal of Speech Technology, 2023, 26 (01) : 185 - 196