ASSERT: Anti-Spoofing with Squeeze-Excitation and Residual neTworks

Cited by: 81
Authors
Lai, Cheng-I [1]
Chen, Nanxin [1]
Villalba, Jesus [1]
Dehak, Najim [1]
Affiliations
[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA
Source
INTERSPEECH 2019 | 2019
Keywords
ASVspoof; Anti-Spoofing; Speaker Verification; COUNTERMEASURES; FEATURES
DOI
10.21437/Interspeech.2019-1794
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
We present JHU's system submission to the ASVspoof 2019 Challenge: Anti-Spoofing with Squeeze-Excitation and Residual neTworks (ASSERT). Anti-spoofing has attracted increasing attention since the inauguration of the ASVspoof Challenges, and ASVspoof 2019 is dedicated to addressing attacks of all three major types: text-to-speech, voice conversion, and replay. Building on previous research on deep neural networks (DNNs), ASSERT is a pipeline for a DNN-based approach to anti-spoofing. ASSERT has four components: feature engineering, DNN models, network optimization, and system combination, where the DNN models are variants of squeeze-excitation and residual networks. We conducted an ablation study of the effectiveness of each component on the ASVspoof 2019 corpus, and experimental results showed that ASSERT obtained more than 93% and 17% relative improvements over the baseline systems in the two sub-challenges of ASVspoof 2019, ranking ASSERT among the top-performing systems. Code and pretrained models are made publicly available.
Pages: 1013-1017
Number of pages: 5