DEEP NEURAL NETWORK BASED POSTERIORS FOR TEXT-DEPENDENT SPEAKER VERIFICATION

被引:0
|
作者
Dey, Subhadeep [1 ,2 ]
Madikeri, Srikanth [1 ]
Ferras, Marc [1 ]
Modicek, Petr [1 ]
机构
[1] Idiap Res Inst, Martigny, Switzerland
[2] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
来源
2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS | 2016年
关键词
Text-dependent speaker verification; DNN posterior; Dynamic Time Warping;
D O I
暂无
中图分类号
O42 [声学];
学科分类号
070206 ; 082403 ;
摘要
The i-vector and Joint Factor Analysis (JFA) systems for text-dependent speaker verification use sufficient statistics computed from a speech utterance to estimate speaker models. These statistics average the acoustic information over the utterance thereby losing all the sequence information. In this paper, we study explicit content matching using Dynamic Time Warping (DTW) and present the best achievable error rates for speaker-dependent and speaker-independent content matching. For this purpose, a Deep Neural Network/Hidden Markov Model Automatic Speech Recognition (DNN/HMM ASR) system is used to extract content-related posterior probabilities. This approach outperforms systems using Gaussian mixture model posteriors by at least 50% Equal Error Rate (EER) on the RSR2015 in content mismatch trials. DNN posteriors are also used in i-vector and JFA systems, obtaining EERs as low as 0.02%.
引用
收藏
页码:5050 / 5054
页数:5
相关论文
共 50 条
  • [21] Analysis of the Hilbert Spectrum for Text-Dependent Speaker Verification
    Sharma, Rajib
    Bhukya, Ramesh K.
    Prasanna, S. R. M.
    SPEECH COMMUNICATION, 2018, 96 : 207 - 224
  • [22] Towards Goat Detection in Text-Dependent Speaker Verification
    Toledo-Ronen, Orith
    Aronowitz, Hagai
    Hoory, Ron
    Pelecanos, Jason
    Nahamoo, David
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 16 - +
  • [23] Neural network models for combining evidence from spectral and suprasegmental features for text-dependent speaker verification
    Prasanna, SRM
    Zachariah, JM
    Yegnanarayana, B
    PROCEEDINGS OF INTERNATIONAL CONFERENCE ON INTELLIGENT SENSING AND INFORMATION PROCESSING, 2004, : 359 - 363
  • [24] END-TO-END ATTENTION BASED TEXT-DEPENDENT SPEAKER VERIFICATION
    Zhang, Shi-Xiong
    Chen, Zhuo
    Zhao, Yong
    Li, Jinyu
    Gong, Yifan
    2016 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2016), 2016, : 171 - 178
  • [25] EXPLOITING SEQUENCE INFORMATION FOR TEXT-DEPENDENT SPEAKER VERIFICATION
    Dey, Subhadeep
    Motlicek, Petr
    Madikeri, Srikanth
    Ferras, Marc
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5370 - 5374
  • [26] Text-dependent speaker recognition using the fuzzy ARTMAP neural network
    Woo, SC
    Lim, CP
    Osman, R
    IEEE 2000 TENCON PROCEEDINGS, VOLS I-III: INTELLIGENT SYSTEMS AND TECHNOLOGIES FOR THE NEW MILLENNIUM, 2000, : 33 - 38
  • [27] Lexicon-Based Local Representation for Text-Dependent Speaker Verification
    You, Hanxu
    Li, Wei
    Li, Lianqiang
    Zhu, Jie
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2017, E100D (03): : 587 - 589
  • [28] Text-dependent Speaker Verification Using Word-based Scoring
    Yao, Shengyu
    Huang, Houjun
    Zhou, Ruohua
    Yan, Yonghong
    2018 11TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2018, : 314 - 318
  • [29] Data Augmentation Enhanced Speaker Enrollment for Text-dependent Speaker Verification
    Sarkar, Achintya Kumar
    Sarma, Himangshu
    Dwivedi, Priyanka
    Tan, Zheng-Hua
    2020 3RD INTERNATIONAL CONFERENCE ON ENERGY, POWER AND ENVIRONMENT: TOWARDS CLEAN ENERGY TECHNOLOGIES (ICEPE 2020), 2021,
  • [30] Template-matching for text-dependent speaker verification
    Dey, Subhadeep
    Motlicek, Petr
    Madikeri, Srikanth
    Ferras, Marc
    SPEECH COMMUNICATION, 2017, 88 : 96 - 105