Deep Siamese Architecture Based Replay Detection for Secure Voice Biometric

Cited by: 33
Authors
Sriskandaraja, Kaavya [1, 2]
Sethu, Vidhyasaharan [1]
Ambikairajah, Eliathamby [1, 2]
Affiliations
[1] UNSW Australia, Sch Elect Engn & Telecommun, Sydney, NSW, Australia
[2] CSIRO, DATA61, Sydney, NSW, Australia
Source
19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018
Keywords
voice biometrics; anti-spoofing; Siamese; deep learning; speech recognition; human-computer interaction; SPEAKER VERIFICATION; SPOOFING DETECTION; COUNTERMEASURES; ATTACK;
DOI
10.21437/Interspeech.2018-1819
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Replay attacks are the simplest and most easily accessible form of spoofing attack on voice biometric systems, and they can be hard to detect with systems designed to identify spoofing attacks based on synthesised speech. In this paper, we propose a novel approach that evaluates the similarity between pairs of speech samples to detect replayed speech, based on a suitable embedding learned by deep Siamese architectures. Specifically, we train a deep Siamese network to identify pairs of genuine speech samples and pairs of replayed speech samples as 'similar', and mixed pairs of genuine and replayed speech as 'dissimilar'. Siamese networks are particularly suited to this task and have been shown to be effective in problems where intra-class variability is large and the number of training samples per class is relatively small. The internal low-dimensional embedding learnt by the Siamese network to accomplish this task is then used as the basis for replay detection. The proposed approach outperforms state-of-the-art systems when evaluated on the ASVspoof 2017 challenge corpus without relying on fusion with other sub-systems.
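The pairing scheme described in the abstract can be illustrated with a minimal sketch. The abstract does not state which loss function or distance the authors use, so the contrastive loss, the Euclidean distance, and the names `label_pairs` and `contrastive_loss` below are assumptions chosen for illustration, not the paper's implementation.

```python
import math
from itertools import combinations


def label_pairs(classes):
    """Return (i, j, similar) for every unordered pair of samples.

    similar is 1 when both samples share a class (genuine/genuine or
    replayed/replayed) and 0 for a mixed pair, as in the abstract.
    """
    return [(i, j, int(classes[i] == classes[j]))
            for i, j in combinations(range(len(classes)), 2)]


def euclidean(a, b):
    """Euclidean distance between two embedding vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, similar, margin=1.0):
    """Contrastive loss for one pair (an assumed loss, not from the paper).

    Similar pairs are penalised by their squared distance; dissimilar
    pairs are penalised only when closer than the margin.
    """
    d = euclidean(emb_a, emb_b)
    if similar:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Hypothetical toy labels: two genuine and two replayed utterances.
classes = ["genuine", "genuine", "replayed", "replayed"]
pairs = label_pairs(classes)
```

In this toy setting, only the genuine/genuine pair and the replayed/replayed pair are labelled similar; the four mixed pairs are labelled dissimilar. In a real system the embeddings would come from the two weight-sharing branches of the network rather than from hand-written vectors.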
Pages: 671-675
Number of pages: 5