Evaluation of Speaker Verification Security and Detection of HMM-Based Synthetic Speech

Cited by: 173
Authors
De Leon, Phillip L. [1]
Pucher, Michael [2]
Yamagishi, Junichi [3]
Hernaez, Inma [4]
Saratxaga, Ibon [4]
Affiliations
[1] New Mexico State Univ, Klipsch Sch Elect & Comp Engn, Las Cruces, NM 88003 USA
[2] Telecommun Res Ctr Vienna FTW, A-1220 Vienna, Austria
[3] Univ Edinburgh, Ctr Speech Technol Res, Edinburgh EH8 9AB, Midlothian, Scotland
[4] Univ Basque Country, Bilbao 48013, Spain
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2012 / Vol. 20 / Issue 08
Funding
Austrian Science Fund; UK Engineering and Physical Sciences Research Council;
Keywords
Security; speaker recognition; speech synthesis; NORMALIZATION; ALGORITHMS; IMPOSTOR; SYSTEM;
DOI
10.1109/TASL.2012.2201472
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
In this paper, we evaluate the vulnerability of speaker verification (SV) systems to synthetic speech. The SV systems are based on either the Gaussian mixture model-universal background model (GMM-UBM) or support vector machine (SVM) using GMM supervectors. We use a hidden Markov model (HMM)-based text-to-speech (TTS) synthesizer, which can synthesize speech for a target speaker using small amounts of training data through model adaptation of an average voice or background model. Although the SV systems have a very low equal error rate (EER), when tested with synthetic speech generated from speaker models derived from the Wall Street Journal (WSJ) speech corpus, over 81% of the matched claims are accepted. This result suggests vulnerability in SV systems and thus a need to accurately detect synthetic speech. We propose a new feature based on relative phase shift (RPS), demonstrate reliable detection of synthetic speech, and show how this classifier can be used to improve security of SV systems.
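The two figures quoted in the abstract, the equal error rate (EER) and the fraction of synthetic-speech claims accepted, can be illustrated with a minimal sketch. It assumes each verification trial yields a scalar score and that a claim is accepted when the score meets a threshold. The function names `equal_error_rate` and `acceptance_rate` and the toy scores are hypothetical and are not taken from the paper.

```python
import numpy as np

def equal_error_rate(genuine, impostor):
    """Return (eer, threshold) at the point where the false-accept and
    false-reject rates are closest. Assumes higher score = more likely target."""
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    # Candidate thresholds: every observed score.
    thresholds = np.unique(np.concatenate([genuine, impostor]))
    # A claim is accepted when its score >= threshold.
    far = np.array([(impostor >= t).mean() for t in thresholds])
    frr = np.array([(genuine < t).mean() for t in thresholds])
    i = int(np.argmin(np.abs(far - frr)))
    return (far[i] + frr[i]) / 2.0, thresholds[i]

def acceptance_rate(scores, threshold):
    """Fraction of claims whose score meets the acceptance threshold."""
    return float((np.asarray(scores, dtype=float) >= threshold).mean())

# Toy scores, made up for illustration only.
genuine_scores = [2.0, 3.0, 4.0, 5.0]
impostor_scores = [-1.0, 0.0, 1.0, 2.5]
synthetic_scores = [3.0, 4.0, 2.0, 5.0]  # hypothetical matched synthetic claims

eer, thr = equal_error_rate(genuine_scores, impostor_scores)
print(eer, thr, acceptance_rate(synthetic_scores, thr))
```

Fixing the threshold on human speech and then scoring synthetic claims against it mirrors the kind of comparison the abstract reports: a system can show a low EER on human impostors while still accepting most synthetic claims.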
Pages: 2280 / 2290
Page count: 11
Related papers
50 in total
  • [1] A hybrid score measurement for HMM-based speaker verification
    Gu, Y
    Thomas, T
    ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999, : 317 - 320
  • [2] Speaker interpolation for HMM-based speech synthesis system
    Yoshimura, Takayoshi, 2000, Acoustical Soc Jpn, Tokyo, Japan (21):
  • [3] Speaker adaptation of pitch and spectrum for HMM-based speech synthesis
    Tamura, M., 1600, John Wiley and Sons Inc. (35):
  • [4] Frequency Warping for Speaker Adaptation in HMM-based Speech Synthesis
    Gao, Weixun
    Cao, Qiying
    JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2014, 30 (04) : 1149 - 1166
  • [5] HMM-Based Speaker Emotional Recognition Technology for Speech Signal
    Qin, Yuqiang
    Zhang, Xueying
    FRONTIERS OF MANUFACTURING SCIENCE AND MEASURING TECHNOLOGY, PTS 1-3, 2011, 230-232 : 261 - 265
  • [6] SPEAKER SIMILARITY EVALUATION OF FOREIGN-ACCENTED SPEECH SYNTHESIS USING HMM-BASED SPEAKER ADAPTATION
    Wester, Mirjam
    Karhila, Reima
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 5372 - 5375
  • [7] Analysis of speaker clustering strategies for HMM-based speech synthesis
    Dall, Rasmus
    Veaux, Christophe
    Yamagishi, Junichi
    King, Simon
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 994 - 997
  • [8] EVALUATION OF OBJECTIVE MEASURES FOR INTELLIGIBILITY PREDICTION OF HMM-BASED SYNTHETIC SPEECH IN NOISE
    Valentini-Botinhao, Cassia
    Yamagishi, Junichi
    King, Simon
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 5112 - 5115
  • [9] HMM-based integrated method for speaker-independent speech recognition
    Tsinghua Univ, Beijing, China
    Int Conf Signal Process Proc, (613-616):
  • [10] An On-line Speaker Adaptation Method for HMM-based Speech Recognizers
    Banhalmi, Andras
    Kocsor, Andras
    ACTA CYBERNETICA, 2008, 18 (03) : 379 - 390