Learnable MFCCs for Speaker Verification

Cited by: 5

Authors:

Liu, Xuechen [1,2]

Sahidullah, Md [2]

Kinnunen, Tomi [1]

Affiliations:

[1] Univ Eastern Finland, Sch Comp, Joensuu, Finland

[2] Univ Lorraine, CNRS, INRIA, LORIA, F-54000 Nancy, France

Source:

2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS) | 2021

Funding:

Academy of Finland

Keywords:

Speaker verification; feature extraction; mel-frequency cepstral coefficients (MFCCs); RECOGNITION; FEATURES

DOI:

10.1109/ISCAS51556.2021.9401593

Chinese Library Classification (CLC) codes:

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification codes:

0808 ; 0809 ;

Abstract:

We propose a learnable mel-frequency cepstral coefficients (MFCCs) front-end architecture for deep neural network (DNN) based automatic speaker verification. Our architecture retains the simplicity and interpretability of MFCC-based features while allowing the model to be adapted to data flexibly. In practice, we formulate data-driven versions of the four linear transforms in a standard MFCC extractor: windowing, discrete Fourier transform (DFT), mel filterbank and discrete cosine transform (DCT). Reported results reach up to 6.7% (VoxCeleb1) and 9.7% (SITW) relative improvement in terms of equal error rate (EER) over static MFCCs, without additional tuning effort.
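As a reading aid, the four linear transforms named in the abstract can be sketched for the static (non-learnable) baseline. This is a minimal NumPy illustration, not the authors' implementation: the sample rate, frame length, filter count, number of coefficients, Hamming window, triangular mel filters and unnormalized DCT-II are all assumptions chosen for the example, and every function name is hypothetical. In a learnable front-end of the kind the abstract describes, the fixed arrays `window`, `dft`, `fb` and `dct` would presumably become trainable parameters initialized from values like these.

```python
import numpy as np

def hz_to_mel(f):
    # Common mel-scale formula (an assumption; other variants exist).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters spaced evenly on the mel scale: (n_filters, n_bins).
    n_bins = n_fft // 2 + 1
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    bin_freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    fb = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (bin_freqs - left) / (center - left)
        falling = (right - bin_freqs) / (right - center)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def dct_matrix(n_ceps, n_filters):
    # Unnormalized DCT-II basis: (n_ceps, n_filters).
    n = np.arange(n_filters)
    k = np.arange(n_ceps)[:, None]
    return np.cos(np.pi * k * (2 * n + 1) / (2.0 * n_filters))

def mfcc_frame(frame, window, dft, fb, dct, eps=1e-10):
    windowed = window * frame          # 1) windowing (a diagonal linear map)
    spectrum = dft @ windowed          # 2) DFT as a matrix product
    power = np.abs(spectrum) ** 2      #    power spectrum (nonlinear step)
    log_mel = np.log(fb @ power + eps) # 3) mel filterbank, then log compression
    return dct @ log_mel               # 4) DCT yields the cepstral coefficients

# Illustrative settings (assumptions, not taken from the paper).
sample_rate, n_fft, n_filters, n_ceps = 16000, 512, 40, 20
window = np.hamming(n_fft)
dft = np.fft.fft(np.eye(n_fft))[: n_fft // 2 + 1]  # one-sided DFT matrix
fb = mel_filterbank(n_filters, n_fft, sample_rate)
dct = dct_matrix(n_ceps, n_filters)

frame = np.random.default_rng(0).standard_normal(n_fft)  # stand-in for one speech frame
coeffs = mfcc_frame(frame, window, dft, fb, dct)
```

Writing each stage as an explicit array makes it visible which quantities a data-driven variant could update by gradient descent, while the squared magnitude and the logarithm remain fixed nonlinearities between them.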


Pages: 5

50 items in total

[41] Speaker Verification based on extraction of Deep Features
Mitsianis, Evangelos
Spyrou, Evaggelos
Giannakopoulos, Theodore
10TH HELLENIC CONFERENCE ON ARTIFICIAL INTELLIGENCE (SETN 2018), 2018,
[42] Implementation of Text Dependent Speaker Verification on MATLAB
Kaur, Gurpreet
Kumar, Naresh
Khanna, Ravinder
Kumar, Amod
2015 2ND INTERNATIONAL CONFERENCE ON RECENT ADVANCES IN ENGINEERING & COMPUTATIONAL SCIENCES (RAECS), 2015,
[43] Adaptive Margin Circle Loss for Speaker Verification
Xiao, Runqiu
Miao, Xiaoxiao
Wang, Wenchao
Zhang, Pengyuan
Cai, Bin
Luo, Liuping
INTERSPEECH 2021, 2021, : 4618 - 4622
[44] Optimal Impostor Model in Automatic Speaker Verification
Djellali, Hayet
Laskri, Mohamed Tayeb
PROCEEDINGS OF 2012 INTERNATIONAL CONFERENCE ON COMPLEX SYSTEMS (ICCS12), 2012, : 545 - 550
[45] PLDA inspired Siamese networks for speaker verification
Ramoji, Shreyas
Krishnan, Prashant
Ganapathy, Sriram
COMPUTER SPEECH AND LANGUAGE, 2022, 76
[46] Attentive Feature Fusion for Robust Speaker Verification
Liu, Bei
Chen, Zhengyang
Qian, Yanmin
INTERSPEECH 2022, 2022, : 286 - 290
[47] Speaker Verification based on Comparing Normalized Spectrograms
Leu, Jia-Guu
Geeng, Liang-tsair
Pu, Chang En
Shiau, Jyh-Bin
2011 IEEE INTERNATIONAL CARNAHAN CONFERENCE ON SECURITY TECHNOLOGY (ICCST), 2011,
[48] Quality measures for speaker verification with short utterances
Poddar, Arnab
Sahidullah, Md
Saha, Goutam
DIGITAL SIGNAL PROCESSING, 2019, 88 : 66 - 79
[49] A Robust SVM/GMM Classifier for Speaker Verification
Cirovic, Zoran
Cirovic, Natasa
SPEECH AND COMPUTER, 2014, 8773 : 74 - 80
[50] Score-Aging Calibration for Speaker Verification
Kelly, Finnian
Hansen, John H. L.
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (12) : 2414 - 2424
