PHYSIOLOGICALLY-MOTIVATED FEATURE EXTRACTION FOR SPEAKER IDENTIFICATION

Cited by: 0
Authors
Wang, Jianglin [1]
Johnson, Michael T. [1]
Affiliations
[1] Marquette Univ, Dept Elect & Comp Engn, Speech & Signal Proc Lab, Milwaukee, WI 53233 USA
Source
2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2014
Keywords
Speaker distinctive feature; Speaker identification; Glottal source excitation and GMM-UBM; VERIFICATION; PHASE; MFCC
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper introduces three physiologically motivated features for speaker identification: Residual Phase Cepstrum Coefficients (RPCC), Glottal Flow Cepstrum Coefficients (GLFCC), and Teager Phase Cepstrum Coefficients (TPCC). These features capture speaker-discriminative characteristics from different aspects of glottal source excitation patterns. They give better results at lower model complexities and also provide complementary information that can improve overall system performance even with larger amounts of data. Speaker identification results on the YOHO corpus show that the features are both more accurate than, and complementary to, traditional mel-frequency cepstral coefficients (MFCC). In particular, incorporating the proposed glottal source features significantly improves the robustness and accuracy of speaker identification.
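The abstract names the features but does not describe how they are computed. As a loosely related illustration only, the name TPCC suggests a connection to the Teager energy operator, so the sketch below shows the standard discrete form of that operator, psi[n] = x[n]^2 - x[n-1] * x[n+1]. This is an assumption about the background, not the authors' TPCC pipeline; the function name `teager_energy` and the test tone are hypothetical.

```python
import math


def teager_energy(x):
    """Discrete Teager energy operator on interior samples.

    psi[n] = x[n]^2 - x[n-1] * x[n+1], for n = 1 .. len(x) - 2.
    Hypothetical helper for illustration; not taken from the paper.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Illustrative input (an assumption): a pure tone A * cos(omega * n).
# For such a tone the operator is constant and equals A^2 * sin(omega)^2.
A, omega = 2.0, 0.3
tone = [A * math.cos(omega * n) for n in range(64)]
psi = teager_energy(tone)
```

For a pure tone the output depends on both amplitude and frequency, which is why the operator is commonly used to track the instantaneous energy of a signal.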
Pages: 5
Related papers
50 in total
  • [41] The performance comparison of fitting feature with segment model in speaker identification
    Yu, CG
    Yang, YC
    Wu, ZH
    2003 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS, VOLS 1-5, CONFERENCE PROCEEDINGS, 2003, : 4216 - 4221
  • [42] ROBUST SPEAKER IDENTIFICATION USING AN AUDITORY-BASED FEATURE
    Li, Qi
    Huang, Yan
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4514 - 4517
  • [43] STATISTICAL FEATURE OF PITCH FREQUENCY DISTRIBUTIONS FOR ROBUST SPEAKER IDENTIFICATION
    Zhang, Linghua
    Zheng, Baoyu
    Yang, Zhen
    Journal of Electronics(China), 2005, (04) : 437 - 442
  • [44] Speaker Identification System Based on Lip-Motion Feature
    Ma, Xinjun
    Wu, Chenchen
    Li, Yuanyuan
    Zhong, Qianyuan
    COMPUTER VISION SYSTEMS, ICVS 2017, 2017, 10528 : 289 - 299
  • [45] Multitaper Based MFCC Feature Extraction for Robust Speaker Recognition System
    Bharath, K. P.
    Kumar, Rajesh M.
    2019 INNOVATIONS IN POWER AND ADVANCED COMPUTING TECHNOLOGIES (I-PACT), 2019
  • [46] MFCC and Similarity Measurements for Speaker Identification Systems
    Maazouzi, A.
    Aqili, N.
    Aamoud, A.
    Raji, M.
    Hammouch, A.
    PROCEEDINGS OF 2017 INTERNATIONAL CONFERENCE ON ELECTRICAL AND INFORMATION TECHNOLOGIES (ICEIT 2017), 2017
  • [47] Few-Shot Speaker Identification Using Lightweight Prototypical Network With Feature Grouping and Interaction
    Li, Yanxiong
    Chen, Hao
    Cao, Wenchang
    Huang, Qisheng
    He, Qianhua
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 9241 - 9253
  • [48] Enhancement in Speaker Identification through Feature Fusion using Advanced Dilated Convolution Neural Network
    Pentapati, Hema Kumar
    Sridevi, K.
    INTERNATIONAL JOURNAL OF ELECTRICAL AND COMPUTER ENGINEERING SYSTEMS, 2023, 14 (03) : 301 - 310
  • [49] A modified speaker clustering method for efficient speaker identification
    Yan, JiaChang
    Wang, Lei
    2014 SEVENTH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN (ISCID 2014), VOL 2, 2014
  • [50] Neural network based feature transformation for emotion independent speaker identification
    Krothapalli, Sreenivasa
    Yadav, Jaynath
    Sarkar, Sourjya
    Koolagudi, Shashidhar
    Vuppala, Anil
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2012, 15 (03) : 335 - 349