Evaluation of Speaker Verification Security and Detection of HMM-Based Synthetic Speech

Cited by: 176
Authors
De Leon, Phillip L. [1]
Pucher, Michael [2]
Yamagishi, Junichi [3]
Hernaez, Inma [4]
Saratxaga, Ibon [4]
Affiliations
[1] New Mexico State Univ, Klipsch Sch Elect & Comp Engn, Las Cruces, NM 88003 USA
[2] Telecommun Res Ctr Vienna FTW, A-1220 Vienna, Austria
[3] Univ Edinburgh, Ctr Speech Technol Res, Edinburgh EH8 9AB, Midlothian, Scotland
[4] Univ Basque Country, Bilbao 48013, Spain
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2012, Vol. 20, No. 8
Funding
Austrian Science Fund (FWF); UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
Security; speaker recognition; speech synthesis; normalization; algorithms; impostor; system
DOI
10.1109/TASL.2012.2201472
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
In this paper, we evaluate the vulnerability of speaker verification (SV) systems to synthetic speech. The SV systems are based on either the Gaussian mixture model-universal background model (GMM-UBM) or support vector machine (SVM) using GMM supervectors. We use a hidden Markov model (HMM)-based text-to-speech (TTS) synthesizer, which can synthesize speech for a target speaker using small amounts of training data through model adaptation of an average voice or background model. Although the SV systems have a very low equal error rate (EER), when tested with synthetic speech generated from speaker models derived from the Wall Street Journal (WSJ) speech corpus, over 81% of the matched claims are accepted. This result suggests vulnerability in SV systems and thus a need to accurately detect synthetic speech. We propose a new feature based on relative phase shift (RPS), demonstrate reliable detection of synthetic speech, and show how this classifier can be used to improve security of SV systems.
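The two figures the abstract relies on, the equal error rate (EER) and the fraction of synthetic-speech claims accepted, can be illustrated with a minimal sketch. The scores below are invented toy values, not data from the paper, and the function names (`error_rates`, `equal_error_rate`, `acceptance_rate`) are hypothetical. The sketch assumes a claim is accepted when its score meets or exceeds a threshold, and it approximates the EER by scanning observed scores rather than interpolating.

```python
def error_rates(genuine, impostor, threshold):
    """Return (false-acceptance rate, false-rejection rate) at a threshold.

    Assumption: a claim is accepted when its score is >= threshold.
    """
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Approximate the EER by trying every observed score as a threshold
    and keeping the one where FAR and FRR are closest."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = error_rates(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    _, eer, threshold = best
    return eer, threshold


def acceptance_rate(scores, threshold):
    """Fraction of claims whose score reaches the acceptance threshold."""
    return sum(s >= threshold for s in scores) / len(scores)


# Toy scores, invented for illustration only (not results from the paper).
genuine = [2.1, 1.8, 2.5, 1.2, 0.4]      # true-speaker claims
impostor = [-1.0, -0.3, 0.5, -2.2, -0.8]  # human impostor claims
synthetic = [1.5, 0.9, 2.0, 0.1, 1.1]     # synthetic-speech claims

eer, thr = equal_error_rate(genuine, impostor)
print(f"EER = {eer:.2f} at threshold {thr}")
print(f"synthetic acceptance = {acceptance_rate(synthetic, thr):.2f}")
```

With these toy scores the threshold lands at 0.5 and four of the five synthetic claims are accepted. This mirrors the abstract's point that a threshold tuned against human impostors may still admit most synthetic-speech claims.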
Pages: 2280-2290
Page count: 11