Multilingual Speaker Identification by Combining Evidence from LPR and Multitaper MFCC

Cited by: 2
Authors
Nagaraja, B. [1]
Jayanna, H. [1]
Affiliations
[1] Siddaganga Inst Technol, Dept Informat Sci & Engn, Tumkur 572103, Karnataka, India
Keywords
Speaker identification; mel-frequency cepstral coefficients; multitaper mel-frequency cepstral coefficients; multilingual; linear prediction residual; linear prediction residual phase
DOI
10.1515/jisys-2013-0038
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This work demonstrates the benefit of combining evidence from multitaper mel-frequency cepstral coefficients (MFCC), linear prediction residual (LPR), and linear prediction residual phase (LPRP) features for multilingual speaker identification under limited-data conditions. The LPR is derived from linear prediction analysis, and the LPRP is obtained by dividing the LPR by its Hilbert envelope. Sine-weighted cepstrum estimators (SWCE) with six tapers are used for multitaper MFCC feature extraction. A Gaussian mixture model-universal background model (GMM-UBM) is used to model each speaker for each type of evidence. The evidence is then combined at the score level to improve performance. Monolingual, crosslingual, and multilingual speaker identification studies were conducted using 30 randomly selected speakers from the IITG multivariability speaker recognition database. The experimental results show that the combined evidence improves performance by nearly 8-10% compared with the individual evidence.
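The LPR and LPRP derivation described in the abstract can be illustrated with a minimal sketch. It assumes the autocorrelation method with a Levinson-Durbin recursion for the linear prediction step and takes the Hilbert envelope as the magnitude of the analytic signal; the abstract does not state the LP order, frame length, or implementation details, so the function names (`lpc`, `analytic_signal`, `lpr_and_phase`), the order of 6, and the synthetic frame are all illustrative assumptions rather than the authors' settings. Only the standard library is used so the sketch runs anywhere.

```python
import cmath
import math


def lpc(frame, order):
    """LP coefficients a_1..a_order (autocorrelation method, Levinson-Durbin)."""
    n = len(frame)
    r = [sum(frame[i] * frame[i + k] for i in range(n - k)) for k in range(order + 1)]
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or numerically degenerate frame: stop early
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


def analytic_signal(x):
    """Analytic signal via a naive O(N^2) DFT (fine for one short frame)."""
    n = len(x)
    spec = [sum(x[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
            for f in range(n)]
    for f in range(n):
        if f == 0 or (n % 2 == 0 and f == n // 2):
            continue              # keep DC and Nyquist bins unchanged
        if f < (n + 1) // 2:
            spec[f] *= 2          # double positive frequencies
        else:
            spec[f] = 0           # zero negative frequencies
    return [sum(spec[f] * cmath.exp(2j * math.pi * f * t / n) for f in range(n)) / n
            for t in range(n)]


def lpr_and_phase(frame, order=6, eps=1e-12):
    """Return (LPR, LPRP): the LP residual and the residual divided by its Hilbert envelope."""
    a = lpc(frame, order)
    residual = [
        frame[n] - sum(a[k] * frame[n - 1 - k] for k in range(order) if n - 1 - k >= 0)
        for n in range(len(frame))
    ]
    envelope = [abs(z) for z in analytic_signal(residual)]
    phase = [e / (h + eps) for e, h in zip(residual, envelope)]  # eps avoids 0/0
    return residual, phase


# Synthetic 80-sample frame (an assumption; real input would be a speech frame).
frame = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n + 0.4) + 0.05 * math.sin(0.7 * n * n)
         for n in range(80)]
residual, phase = lpr_and_phase(frame, order=6)
```

Because the Hilbert envelope is never smaller than the magnitude of the residual, the resulting LPRP values lie in the range [-1, 1], which removes the amplitude information and leaves the phase-like structure of the excitation.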
Pages: 241-251
Page count: 11