Variant Time-Frequency Cepstral Features for Speaker Recognition

Cited by: 0
Authors
Zhang, Wei-Qiang [1]
Deng, Yan [1]
He, Liang [1]
Liu, Jia [1]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Tsinghua Natl Lab Informat Sci & Technol, Beijing 100084, Peoples R China
Keywords
Speaker recognition (SRE); time-frequency cepstrum (TFC); IDENTIFICATION; MODELS
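DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification code
0808; 0809
Abstract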
In speaker recognition (SRE), the commonly used feature vector consists of basic cepstral coefficients concatenated with their delta and double-delta cepstral features. This configuration is borrowed from speech recognition and may not be optimal for SRE. In this paper, we propose variant time-frequency cepstral (TFC) features, which are based on our previous work on language recognition. The feature vector is obtained by performing a temporal discrete cosine transform (DCT) on the cepstrum matrix and selecting the transformed elements in a specific area with large variances. Different shapes and parameters are tested and the optimal configuration is obtained. Experimental results on the 2008 NIST speaker recognition evaluation short2 telephone-short3 telephone test set show that the proposed variant TFC is more effective than the conventional feature vectors.
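The extraction step described in the abstract (a temporal DCT over a cepstrum matrix, followed by selection of high-variance elements) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the window length of 10 frames, the 13 cepstral coefficients, the random input data, and the rule of keeping the 40 positions with the largest empirical variance (the paper selects a region of a particular shape, which this sketch does not reproduce). The function names are hypothetical.

```python
import numpy as np


def dct_matrix(n):
    """Return the orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]  # output (DCT) index
    t = np.arange(n)[None, :]  # input (time) index
    m = np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    m[0] *= np.sqrt(1.0 / n)
    m[1:] *= np.sqrt(2.0 / n)
    return m


def temporal_dct_blocks(cepstra, context):
    """Apply a DCT along time to every window of `context` frames.

    cepstra: array of shape (frames, coeffs).
    Returns an array of shape (windows, coeffs, context).
    """
    n_frames = cepstra.shape[0]
    # Each window is a cepstrum matrix laid out as (coeffs, context).
    windows = np.stack(
        [cepstra[i:i + context].T for i in range(n_frames - context + 1)]
    )
    # Multiplying by D^T transforms the last (time) axis.
    return windows @ dct_matrix(context).T


def high_variance_mask(blocks, n_keep):
    """Mark the n_keep (coeff, DCT index) positions with the largest variance.

    Assumption: a plain top-K rule stands in for the paper's shaped region.
    """
    var = blocks.var(axis=0)  # variance over windows, shape (coeffs, context)
    top = np.argsort(var, axis=None)[::-1][:n_keep]
    mask = np.zeros(var.shape, dtype=bool)
    mask[np.unravel_index(top, var.shape)] = True
    return mask


# Illustrative input: random numbers standing in for real cepstral frames.
rng = np.random.default_rng(0)
cepstra = rng.standard_normal((200, 13))

blocks = temporal_dct_blocks(cepstra, context=10)
mask = high_variance_mask(blocks, n_keep=40)
features = blocks[:, mask]  # one 40-dimensional vector per window
```

In practice the mask would be estimated once on training data and then reused unchanged for all utterances, so that every feature vector has the same layout.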
Pages: 2122-2125
Page count: 4
Related papers
50 in total
  • [21] Low-variance Multitaper Mel-frequency Cepstral Coefficient Features for Speech and Speaker Recognition Systems
    Alam, Md. Jahangir
    Kenny, Patrick
    O'Shaughnessy, Douglas
    COGNITIVE COMPUTATION, 2013, 5 (04) : 533 - 544
  • [22] Speaker independent phoneme recognition based on fractal dimension (DF) and the mel-frequency cepstral coefficients features
    Fekkai, S
    Al-Akaidi, M
    Blackledge, JM
    2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING - VOL IV: SIGNAL PROCESSING FOR COMMUNICATIONS; VOL V: SIGNAL PROCESSING EDUCATION SENSOR ARRAY & MULTICHANNEL SIGNAL PROCESSING AUDIO & ELECTROACOUSTICS; VOL VI: SIGNAL PROCESSING THEORY & METHODS STUDENT FORUM, 2001, : 4014 - 4014
  • [23] Password secured speaker recognition using time and frequency domain features
    Prasad, K. Satya
    Sheela, K. Anitha
    Latha, M. Madhavi
    PROCEEDINGS OF THE FOURTH IASTED INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, PATTERN RECOGNITION, AND APPLICATIONS, 2007, : 309 - +
  • [24] A CEPSTRAL BASED SPEAKER RECOGNITION SYSTEM
    SETHURAMAN, R
    GOWDY, JN
    PROCEEDINGS : THE TWENTY-FIRST SOUTHEASTERN SYMPOSIUM ON SYSTEM THEORY, 1989, : 503 - 507
  • [25] Text-Dependent Speaker Recognition by Efficient Capture of Speaker Dynamics in Compressed Time-Frequency Representations of Speech
    Das, Amitava
    Chittaranjan, Gokul
    INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 1921 - 1924
  • [26] Real-Time Speaker Identification System using Cepstral Features
    Barik, Monalisha
    Sarangi, Susanta Kumar
    Sahu, Sushanta Kumar
    2016 2ND INTERNATIONAL CONFERENCE ON COMMUNICATION CONTROL AND INTELLIGENT SYSTEMS (CCIS), 2016, : 89 - 93
  • [27] Speaker Recognition Using Mel Frequency Cepstral Coefficient and Locality Sensitive Hashing
    Awais, Ahmed
    Kun, She
    Yu, Yue
    Hayat, Shaukat
    Ahmed, Aftab
    Tu, Tianyi
    2018 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND BIG DATA (ICAIBD), 2018, : 271 - 276
  • [28] Lightweight Fusion Model with Time-Frequency Features for Speech Emotion Recognition
    Zhang, Peng
    Li, Meijuan
    Zhao, Hui
    Chen, Yida
    Wang, Fuqiang
    Li, Ye
    Zhao, Wei
    PROCEEDINGS OF THE 2024 27TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN, CSCWD 2024, 2024, : 3017 - 3022
  • [29] Semisupervised Deep Features of Time-Frequency Maps for Multimodal Emotion Recognition
    Zali-Vargahan, Behrooz
    Charmin, Asghar
    Kalbkhani, Hashem
    Barghandan, Saeed
    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2023, 2023
  • [30] RECOGNITION OF EMOTIONS FROM TIME AND TIME-FREQUENCY FEATURES USING FACIAL ELECTROMYOGRAPHY SIGNALS
    Shiva J.
    Makaram N.
    Karthick P.A.
    Swaminathan R.
    Biomedical Sciences Instrumentation, 2021, 57 (03) : 386 - 391