Learnable MFCCs for Speaker Verification

Cited by: 5
Authors
Liu, Xuechen [1, 2]
Sahidullah, Md [2]
Kinnunen, Tomi [1]
Affiliations
[1] Univ Eastern Finland, Sch Comp, Joensuu, Finland
[2] Univ Lorraine, CNRS, INRIA, LORIA, F-54000 Nancy, France
Source
2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS) | 2021
Funding
Academy of Finland
Keywords
Speaker verification; feature extraction; mel-frequency cepstral coefficients (MFCCs); RECOGNITION; FEATURES
DOI
10.1109/ISCAS51556.2021.9401593
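Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract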
We propose a learnable mel-frequency cepstral coefficient (MFCC) front-end architecture for deep neural network (DNN) based automatic speaker verification. Our architecture retains the simplicity and interpretability of MFCC-based features while allowing the model to adapt flexibly to data. In practice, we formulate data-driven versions of the four linear transforms in a standard MFCC extractor: windowing, discrete Fourier transform (DFT), mel filterbank, and discrete cosine transform (DCT). The reported results reach up to 6.7% (VoxCeleb1) and 9.7% (SITW) relative improvement in terms of equal error rate (EER) over static MFCCs, without additional tuning effort.
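The four linear transforms named in the abstract can each be written as a fixed vector or matrix. The sketch below is a minimal NumPy illustration of a standard (static) MFCC extractor in that form; it is not the authors' implementation. The function names, the Hamming window, the HTK-style mel formula, the triangular filters, and the orthonormal DCT-II are all assumptions chosen for illustration. A learnable variant would presumably initialize trainable parameters from these values and update them during training, but the record does not give the exact parameterization.

```python
import numpy as np

def dft_matrix(n_fft):
    # DFT restricted to non-negative frequencies: shape (n_fft//2 + 1, n_fft)
    n = np.arange(n_fft)
    k = np.arange(n_fft // 2 + 1)[:, None]
    return np.exp(-2j * np.pi * k * n / n_fft)

def hz_to_mel(f):
    # HTK-style mel scale (an assumption; other variants exist)
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sample_rate):
    # Triangular filters with centers equally spaced on the mel scale
    n_bins = n_fft // 2 + 1
    bin_hz = np.linspace(0.0, sample_rate / 2.0, n_bins)
    edges_hz = mel_to_hz(
        np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    )
    fbank = np.zeros((n_mels, n_bins))
    for m in range(n_mels):
        lo, mid, hi = edges_hz[m:m + 3]
        rising = (bin_hz - lo) / (mid - lo)
        falling = (hi - bin_hz) / (hi - mid)
        fbank[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fbank

def dct_matrix(n_ceps, n_mels):
    # Orthonormal DCT-II basis: shape (n_ceps, n_mels)
    m = np.arange(n_mels)
    k = np.arange(n_ceps)[:, None]
    basis = np.sqrt(2.0 / n_mels) * np.cos(np.pi * k * (2 * m + 1) / (2 * n_mels))
    basis[0] /= np.sqrt(2.0)
    return basis

def mfcc(frames, window, dft, fbank, dct, eps=1e-10):
    # frames: (n_frames, n_fft) array of already-framed speech samples
    windowed = frames * window                    # 1) windowing
    power = np.abs(windowed @ dft.T) ** 2         # 2) DFT, then power spectrum
    log_mel = np.log(power @ fbank.T + eps)       # 3) mel filterbank, then log
    return log_mel @ dct.T                        # 4) DCT -> cepstral coefficients

# Example with hypothetical settings: 16 kHz audio, 512-sample frames
n_fft, sample_rate, n_mels, n_ceps = 512, 16000, 40, 20
rng = np.random.default_rng(0)
frames = rng.standard_normal((5, n_fft))          # stand-in for real speech frames
window = np.hamming(n_fft)
dft = dft_matrix(n_fft)
fbank = mel_filterbank(n_mels, n_fft, sample_rate)
dct = dct_matrix(n_ceps, n_mels)
feats = mfcc(frames, window, dft, fbank, dct)
```

Writing each stage as an explicit array makes clear which quantities could be treated as parameters. The squared magnitude and the logarithm are the only nonlinear steps and stay fixed in this sketch.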
Pages: 5
Related papers
50 in total
  • [21] Adversarial Reweighting for Speaker Verification Fairness
    Jin, Minho
    Ju, Chelsea J-T
    Chen, Zeya
    Liu, Yi-Chieh
    Droppo, Jasha
    Stolcke, Andreas
    INTERSPEECH 2022, 2022, : 4800 - 4804
  • [22] Influence of Corpus Size on Speaker Verification
    Dustor, Adam
    Klosowski, Piotr
    Izydorczyk, Jacek
    Kopanski, Rafal
    COMPUTER NETWORKS, CN 2015, 2015, 522 : 242 - 249
  • [23] Sparse Classifier Fusion for Speaker Verification
    Hautamaki, Ville
    Kinnunen, Tomi
    Sedlak, Filip
    Lee, Kong Aik
    Ma, Bin
    Li, Haizhou
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21 (08): : 1622 - 1631
  • [24] Session Variability in Automatic Speaker Verification
    Hayet, Djellali
    Radia, Amirouche
    Akila, Djebbar
    Tayeb, Laskri Mohamed
    2014 INTERNATIONAL CONFERENCE ON MULTIMEDIA COMPUTING AND SYSTEMS (ICMCS), 2014, : 185 - 190
  • [25] Sequential Model Adaptation for Speaker Verification
    Wang, Jun
    Wang, Dong
    Wu, Xiaojun
    Zheng, Thomas Fang
    Tejedor, Javier
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 2459 - 2463
  • [26] A Unified Framework for Speaker and Utterance Verification
    Liu, Tianchi
    Madhavi, Maulik
    Das, Rohan Kumar
    Li, Haizhou
    INTERSPEECH 2019, 2019, : 4320 - 4324
  • [27] A Spectrum Smoothing Method for Speaker Verification
    Zhang, Zhaofeng
    Deng, Jing
    Wang, Longbiao
    Xiao, Xiong
    2015 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2015, : 1291 - 1295
  • [28] LOG SPECTRA ENHANCEMENT USING SPEAKER DEPENDENT PRIORS FOR SPEAKER VERIFICATION
    Maina, Ciira Wa
    Walsh, John MacLaren
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 4540 - 4543
  • [29] Deep domain adaptation for anti-spoofing in speaker verification systems
    Himawan, Ivan
    Villavicencio, Fernando
    Sridharan, Sridha
    Fookes, Clinton
    COMPUTER SPEECH AND LANGUAGE, 2019, 58 : 377 - 402
  • [30] HIGH IMPROVEMENT OF SPEAKER IDENTIFICATION AND VERIFICATION BY COMBINING MFCC AND PHASE INFORMATION
    Wang, Longbiao
    Ohtsuka, Shinji
    Nakagawa, Seiichi
    2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 4529 - +