Learnable MFCCs for Speaker Verification

Cited by: 5
Authors
Liu, Xuechen [1, 2]
Sahidullah, Md [2]
Kinnunen, Tomi [1]
Affiliations
[1] Univ Eastern Finland, Sch Comp, Joensuu, Finland
[2] Univ Lorraine, CNRS, INRIA, LORIA, F-54000 Nancy, France
Source
2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS) | 2021
Funding
Academy of Finland;
Keywords
Speaker verification; feature extraction; mel-frequency cepstral coefficients (MFCCs); RECOGNITION; FEATURES;
DOI
10.1109/ISCAS51556.2021.9401593
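Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes
0808 ; 0809 ;
Abstract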
We propose a learnable mel-frequency cepstral coefficient (MFCC) front-end architecture for deep neural network (DNN) based automatic speaker verification. Our architecture retains the simplicity and interpretability of MFCC-based features while allowing the model to adapt flexibly to data. In practice, we formulate data-driven versions of the four linear transforms in a standard MFCC extractor: windowing, discrete Fourier transform (DFT), mel filterbank, and discrete cosine transform (DCT). The reported results reach up to 6.7% (VoxCeleb1) and 9.7% (SITW) relative improvement in terms of equal error rate (EER) over static MFCCs, without additional tuning effort.
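The four linear transforms named in the abstract can be illustrated with a static (non-learned) MFCC extractor. The NumPy sketch below is not the authors' implementation; it only shows each stage as a fixed vector or matrix, which is the kind of quantity a learnable front-end would presumably initialize this way and then update during training. The function names, the Hamming window, the triangular filter shape, and the sizes (512-sample frames, 40 mel filters, 20 coefficients, 16 kHz) are all illustrative assumptions.

```python
import numpy as np

def hamming_window(n_fft):
    # Transform 1: windowing, an elementwise multiply by a fixed vector.
    return np.hamming(n_fft)

def dft_matrix(n_fft):
    # Transform 2: DFT as a complex matrix (non-negative frequencies only).
    k = np.arange(n_fft // 2 + 1)[:, None]
    n = np.arange(n_fft)[None, :]
    return np.exp(-2j * np.pi * k * n / n_fft)

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    # Transform 3: triangular filters spaced evenly on the mel scale.
    freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, freqs.size))
    for m in range(n_mels):
        lo, ctr, hi = hz_pts[m], hz_pts[m + 1], hz_pts[m + 2]
        rising = (freqs - lo) / (ctr - lo)
        falling = (hi - freqs) / (hi - ctr)
        fb[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def dct_matrix(n_ceps, n_mels):
    # Transform 4: orthonormal DCT-II, truncated to the first n_ceps rows.
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_mels)[None, :]
    d = np.sqrt(2.0 / n_mels) * np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    d[0] /= np.sqrt(2.0)
    return d

def mfcc(frames, window, dft, mel_fb, dct, eps=1e-10):
    # frames: (num_frames, n_fft) array of already-framed audio samples.
    spectrum = (frames * window) @ dft.T       # windowing, then DFT
    power = np.abs(spectrum) ** 2              # nonlinearity between linear stages
    mel_energy = power @ mel_fb.T              # mel filterbank
    return np.log(mel_energy + eps) @ dct.T    # log compression, then DCT

# Illustrative usage on random frames (assumed sizes, not from the paper).
frames = np.random.default_rng(0).standard_normal((5, 512))
features = mfcc(frames, hamming_window(512), dft_matrix(512),
                mel_filterbank(40, 512, 16000), dct_matrix(20, 40))
```

Writing each stage as an explicit matrix or vector, rather than calling an FFT routine, is what makes the stages look like ordinary linear layers; how the paper actually parameterizes or constrains them is not stated in the abstract.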
Pages: 5