DEEP NEURAL NETWORK BASED POSTERIORS FOR TEXT-DEPENDENT SPEAKER VERIFICATION

Authors
Dey, Subhadeep [1, 2]
Madikeri, Srikanth [1]
Ferras, Marc [1]
Motlicek, Petr [1]
Affiliations
[1] Idiap Res Inst, Martigny, Switzerland
[2] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
Source
2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS | 2016
Keywords
Text-dependent speaker verification; DNN posterior; Dynamic Time Warping
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The i-vector and Joint Factor Analysis (JFA) systems for text-dependent speaker verification use sufficient statistics computed from a speech utterance to estimate speaker models. These statistics average the acoustic information over the utterance, thereby losing all sequence information. In this paper, we study explicit content matching using Dynamic Time Warping (DTW) and present the best achievable error rates for speaker-dependent and speaker-independent content matching. For this purpose, a Deep Neural Network/Hidden Markov Model Automatic Speech Recognition (DNN/HMM ASR) system is used to extract content-related posterior probabilities. This approach outperforms systems using Gaussian mixture model posteriors by at least 50% relative Equal Error Rate (EER) on the RSR2015 database in content-mismatch trials. DNN posteriors are also used in i-vector and JFA systems, obtaining EERs as low as 0.02%.
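
The content matching described in the abstract can be illustrated with a minimal DTW sketch over two sequences of posterior vectors. The abstract does not specify the paper's local distance, path constraints, or score normalisation. The Euclidean frame distance, the three-way step pattern, and the division by the summed sequence lengths used here are assumptions made only for illustration, and the names `frame_distance` and `dtw_distance` are hypothetical.

```python
import math


def frame_distance(p, q):
    """Euclidean distance between two posterior vectors (assumed local cost)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def dtw_distance(enroll, test):
    """Length-normalised DTW cost between two sequences of posterior vectors."""
    n, m = len(enroll), len(test)
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one frame")
    inf = float("inf")
    # acc[i][j] holds the minimal accumulated cost of aligning
    # enroll[:i] with test[:j]; acc[0][0] is the empty alignment.
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = frame_distance(enroll[i - 1], test[j - 1])
            # Allowed steps: insertion, deletion, or diagonal match.
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    # Normalise by the summed lengths (an assumed normalisation).
    return acc[n][m] / (n + m)


# Toy two-class posteriors: same content with different timing, and different content.
same_a = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9]]
same_b = [[0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]
other = [[0.1, 0.9], [0.9, 0.1], [0.9, 0.1]]

print(dtw_distance(same_a, same_b))
print(dtw_distance(same_a, other))
```

In this toy case the two sequences with the same content order align at zero cost despite their different timing, while the sequence with a different content order yields a positive cost. This is the sequence information that utterance-level averaged statistics discard.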
Pages: 5050 - 5054
Page count: 5
Related papers
50 in total
  • [31] End-to-End Text-Dependent Speaker Verification
    Heigold, Georg
    Moreno, Ignacio
    Bengio, Samy
    Shazeer, Noam
    2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 5115 - 5119
  • [32] MODELLING THE ALTERNATIVE HYPOTHESIS FOR TEXT-DEPENDENT SPEAKER VERIFICATION
    Larcher, Anthony
    Lee, Kong Aik
    Ma, Bin
    Li, Haizhou
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [33] On Residual CNN in Text-Dependent Speaker Verification Task
    Malykh, Egor
    Novoselov, Sergey
    Kudashev, Oleg
    SPEECH AND COMPUTER, SPECOM 2017, 2017, 10458 : 593 - 601
  • [34] DNN BASED SPEAKER EMBEDDING USING CONTENT INFORMATION FOR TEXT-DEPENDENT SPEAKER VERIFICATION
    Dey, Subhadeep
    Koshinaka, Takafumi
    Motlicek, Petr
    Madikeri, Srikanth
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5344 - 5348
  • [35] Constrained temporal structure for text-dependent speaker verification
    Larcher, Anthony
    Bonastre, Jean-Francois
    Mason, John S. D.
    DIGITAL SIGNAL PROCESSING, 2013, 23 (06) : 1910 - 1917
  • [36] Text-dependent speaker verification based on i-vectors, Neural Networks and Hidden Markov Models
    Zeinali, Hossein
    Sameti, Hossein
    Burget, Lukas
    Cernocky, Jan Honza
    COMPUTER SPEECH AND LANGUAGE, 2017, 46 : 53 - 71
  • [37] On Training Targets and Activation Functions for Deep Representation Learning in Text-Dependent Speaker Verification
    Sarkar, Achintya Kumar
    Tan, Zheng-Hua
    ACOUSTICS, 2023, 5 (03): : 693 - 713
  • [38] TEXT-DEPENDENT GMM-JFA SYSTEM FOR PASSWORD BASED SPEAKER VERIFICATION
    Novoselov, Sergey
    Pekhovsky, Timur
    Shulipa, Andrey
    Sholokhov, Alexey
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [39] Voice Transformation-based Spoofing of Text-Dependent Speaker Verification Systems
    Kons, Zvi
    Aronowitz, Hagai
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 945 - 949
  • [40] Improved Deep Speaker Feature Learning for Text-Dependent Speaker Recognition
    Li, Lantian
    Lin, Yiye
    Zhang, Zhiyong
    Wang, Dong
    2015 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2015, : 426 - 429