Learnable MFCCs for Speaker Verification

Cited by: 5
Authors
Liu, Xuechen [1, 2]
Sahidullah, Md [2]
Kinnunen, Tomi [1]
Affiliations
[1] Univ Eastern Finland, Sch Comp, Joensuu, Finland
[2] Univ Lorraine, CNRS, INRIA, LORIA, F-54000 Nancy, France
Source
2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS) | 2021
Funding
Academy of Finland
Keywords
Speaker verification; feature extraction; mel-frequency cepstral coefficients (MFCCs); RECOGNITION; FEATURES
DOI
10.1109/ISCAS51556.2021.9401593
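Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract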
We propose a learnable mel-frequency cepstral coefficient (MFCC) front-end architecture for deep neural network (DNN) based automatic speaker verification. Our architecture retains the simplicity and interpretability of MFCC-based features while allowing the model to adapt flexibly to data. In practice, we formulate data-driven versions of the four linear transforms in a standard MFCC extractor: windowing, discrete Fourier transform (DFT), mel filterbank, and discrete cosine transform (DCT). The reported results reach up to 6.7% (VoxCeleb1) and 9.7% (SITW) relative improvement in terms of equal error rate (EER) over static MFCCs, without additional tuning effort.
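For orientation, the four linear transforms named in the abstract can be sketched as a fixed ("static") MFCC extractor for a single frame. This is a minimal illustration, not the authors' implementation: the Hamming window, the 2595·log10(1 + f/700) mel formula, the orthonormal DCT-II, the 16 kHz sample rate, and the filter and coefficient counts are all assumptions chosen for the example, and every function name is hypothetical. It uses only the Python standard library so that it is self-contained. A learnable front-end in the sense of the abstract would presumably treat these fixed vectors and matrices as trainable parameters rather than constants.

```python
import cmath
import math


def hamming(n):
    # Assumed window choice: symmetric Hamming window of length n.
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def dft_matrix(n_fft):
    # One row per non-negative frequency bin (n_fft // 2 + 1 rows).
    n_bins = n_fft // 2 + 1
    return [[cmath.exp(-2j * math.pi * k * t / n_fft) for t in range(n_fft)]
            for k in range(n_bins)]


def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters with edges equally spaced on an assumed mel scale.
    def hz_to_mel(f):
        return 2595.0 * math.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    freqs = [sample_rate * k / n_fft for k in range(n_bins)]
    bank = []
    for i in range(n_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        bank.append([max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid)))
                     for f in freqs])
    return bank


def dct_matrix(n_ceps, n_filters):
    # Orthonormal DCT-II basis, keeping the first n_ceps rows.
    rows = []
    for k in range(n_ceps):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n_filters)
        rows.append([scale * math.cos(math.pi * k * (2 * n + 1) / (2 * n_filters))
                     for n in range(n_filters)])
    return rows


def matvec(matrix, vector):
    # Plain matrix-vector product on nested lists.
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def static_mfcc(frame, sample_rate=16000, n_filters=20, n_ceps=12):
    # Hypothetical defaults; the source record does not state these values.
    n_fft = len(frame)
    windowed = [w * x for w, x in zip(hamming(n_fft), frame)]                  # 1. windowing
    spectrum = matvec(dft_matrix(n_fft), windowed)                             # 2. DFT
    power = [abs(c) ** 2 for c in spectrum]
    mel_energy = matvec(mel_filterbank(n_filters, n_fft, sample_rate), power)  # 3. mel filterbank
    log_mel = [math.log(e + 1e-10) for e in mel_energy]                        # small floor avoids log(0)
    return matvec(dct_matrix(n_ceps, n_filters), log_mel)                      # 4. DCT
```

Calling `static_mfcc` on a list of 256 samples returns a list of 12 cepstral coefficients under these assumed settings. Writing each stage as an explicit vector or matrix makes it visible which quantities a data-driven variant could update during training.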
Pages: 5
Related papers
50 in total
  • [31] Emotional Speaker Verification Using Novel Modified Capsule Neural Network
    Nassif, Ali Bou
    Shahin, Ismail
    Nemmour, Nawel
    Hindawi, Noor
    Elnagar, Ashraf
    MATHEMATICS, 2023, 11 (02)
  • [32] Wavelet Packet Sub-band Cepstral Coefficient for Speaker Verification
    Min, Hang
    Wei, Guangcun
    Xu, Yunfei
    Zhang, Yanna
    2022 IEEE 6TH ADVANCED INFORMATION TECHNOLOGY, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (IAEAC), 2022, : 1713 - 1717
  • [33] ADVERSARIAL SPEAKER VERIFICATION
    Meng, Zhong
    Zhao, Yong
    Li, Jinyu
    Gong, Yifan
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6216 - 6220
  • [34] DISENTANGLED SPEAKER EMBEDDING FOR ROBUST SPEAKER VERIFICATION
    Yi, Lu
    Mak, Man-Wai
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7662 - 7666
  • [35] Generalization Ability Improvement of Speaker Representation and Anti-Interference for Speaker Verification
    Hong, Qian-Bei
    Wu, Chung-Hsien
    Wang, Hsin-Min
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 486 - 499
  • [36] Improving speaker verification performance against long-term speaker variability
    Wang, Linlin
    Wang, Jun
    Li, Lantian
    Zheng, Thomas Fang
    Soong, Frank K.
    SPEECH COMMUNICATION, 2016, 79 : 14 - 29
  • [37] SPEAKER VERIFICATION BY INEXPERIENCED AND EXPERIENCED LISTENERS VS. SPEAKER VERIFICATION SYSTEM
    Kahn, Juliette
    Audibert, Nicolas
    Rossato, Solange
    Bonastre, Jean-Francois
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 5912 - 5915
  • [38] A STUDY OF SPEAKER VERIFICATION PERFORMANCE WITH EXPRESSIVE SPEECH
    Parthasarathy, Srinivas
    Zhang, Chunlei
    Hansen, John H. L.
    Busso, Carlos
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5540 - 5544
  • [39] Evaluation of MFCC for Speaker Verification on Various Windows
    Jain, Anjali
    Sharma, O. P.
    2014 RECENT ADVANCES AND INNOVATIONS IN ENGINEERING (ICRAIE), 2014,
  • [40] Acoustic Factor Analysis for Robust Speaker Verification
    Hasan, Taufiq
    Hansen, John H. L.
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21 (04): 842 - 853