SPOOFING DETECTION VIA SIMULTANEOUS VERIFICATION OF AUDIO-VISUAL SYNCHRONICITY AND TRANSCRIPTION

Cited by: 0

Authors:

Schoenherr, Lea [1]

Zeiler, Steffen [1]

Kolossa, Dorothea [1]

Affiliations:

[1] Ruhr Univ Bochum, Inst Commun Acoust, Bochum, Germany

Source:

2017 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU) | 2017

Keywords:

spoofing detection; liveness detection; audio-visual speaker recognition; multimodal biometrics; coupled hidden Markov models; ROBUST;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Acoustic speaker recognition systems are highly vulnerable to spoofing attacks via replayed or synthesized utterances. One possible countermeasure is audio-visual speaker recognition. However, adding the visual stream alone does not completely prevent spoofing attacks; it only provides further information for assessing the authenticity of the utterance. Many systems consider the audio and video modalities independently and can easily be spoofed by imitating only a single modality or by a bimodal replay attack with a victim's photograph or video. Therefore, we propose the simultaneous verification of the data synchronicity and the transcription in a challenge-response setup. We use coupled hidden Markov models (CHMMs) for text-dependent spoofing detection and introduce new features that provide information about the transcription of the utterance and the synchronicity of both streams. We evaluate the features for various spoofing scenarios and show that combining the features leads to more robust recognition, also in comparison to the baseline method. Additionally, by evaluating the data on unseen speakers, we show that the spoofing detection is applicable in speaker-independent use cases.


Pages: 591-598

Page count: 8
