ELM speaker identification for limited dataset using multitaper based MFCC and PNCC features with fusion score

Cited by: 11
Authors
Bharath, K. P. [1]
Kumar, Rajesh M. [1]
Affiliations
[1] Vellore Inst Technol, Sch Elect Engn, Vellore, Tamil Nadu, India
Keywords
Multitaper; MFCC; PNCC; Frequency warping; CMVN; EXTREME LEARNING-MACHINE; I-VECTOR; CONTINUOUS AUTHENTICATION; RECOGNITION; SPEECH; COMPENSATION
DOI
10.1007/s11042-020-09353-z
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Speaker recognition under noisy conditions remains a major challenge in speech processing, because environmental noise significantly degrades system performance. The main aim of the proposed work is to identify speakers against clean and noisy backgrounds using a limited dataset. In this paper, we propose multitaper-based Mel-frequency cepstral coefficient (MFCC) and power-normalized cepstral coefficient (PNCC) techniques with fusion strategies. MFCC and PNCC are used with different multitapers to extract features from the speech samples. Cepstral mean and variance normalization (CMVN) and feature warping (FW) are then applied to normalize the features obtained from both techniques. A low-dimensional i-vector model serves as the system model, and several score fusion strategies are used: mean, maximum, weighted sum, cumulative, and concatenated fusion. Finally, an extreme learning machine (ELM) is used for classification to increase the system identification accuracy (SIA); the ELM is a single-hidden-layer feedforward neural network that is less complex and less time-consuming than other neural networks. The proposed system is evaluated on limited data from two databases, TIMIT and SITW 2016, under both clean and noisy background conditions.
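The normalization and score fusion steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`cmvn`, `fuse_scores`, `identify`), the per-utterance normalization, and the default weight are assumptions for illustration. The abstract does not define the cumulative and concatenated fusion variants, so they are left out.

```python
from statistics import fmean, pstdev


def cmvn(frames, eps=1e-10):
    """Cepstral mean and variance normalization over one utterance.

    `frames` is a list of feature vectors (one per frame). Each coefficient
    index is shifted to zero mean and scaled to unit variance.
    """
    dims = list(zip(*frames))            # one tuple per coefficient index
    means = [fmean(d) for d in dims]
    stds = [pstdev(d) for d in dims]     # population standard deviation
    return [
        [(x - m) / (s + eps) for x, m, s in zip(frame, means, stds)]
        for frame in frames
    ]


def fuse_scores(mfcc_scores, pncc_scores, method="mean", weight=0.5):
    """Combine per-speaker scores from two subsystems (hypothetical helper).

    `weight` is the share given to the MFCC scores in the weighted sum.
    """
    pairs = list(zip(mfcc_scores, pncc_scores))
    if method == "mean":
        return [(a + b) / 2 for a, b in pairs]
    if method == "max":
        return [max(a, b) for a, b in pairs]
    if method == "weighted_sum":
        return [weight * a + (1 - weight) * b for a, b in pairs]
    raise ValueError(f"unknown fusion method: {method}")


def identify(scores):
    """Return the index of the speaker with the highest fused score."""
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    # Toy values, not results from the paper.
    normalized = cmvn([[1.0, 10.0], [3.0, 30.0]])
    print(normalized)
    mfcc = [0.2, 0.7, 0.1]
    pncc = [0.6, 0.3, 0.1]
    for method in ("mean", "max", "weighted_sum"):
        fused = fuse_scores(mfcc, pncc, method=method, weight=0.25)
        print(method, identify(fused))
```

In a full system the fused scores would come from the i-vector back end and the final decision from the ELM classifier; here the highest fused score stands in for that decision to keep the sketch short.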
Pages: 28859-28883
Number of pages: 25