Audio-visual imposture

Cited by: 0

Authors:

Karam, Walid [1]

Mokbel, Chafic [1]

Greige, Hanna [1]

Chollet, Gerard [2]

Affiliations:

[1] Univ Balamand, Dept Comp Sci, POB 100, Tripoli, Lebanon

[2] Ecole Natl Super Telecommun Bretagne, F-75634 Paris, France

Source:

MOBILE MULTIMEDIA/IMAGE PROCESSING FOR MILITARY AND SECURITY APPLICATIONS | 2006 / Vol. 6250

Keywords:

speaker verification; voice transformation; active appearance models; gaussian mixture models; modality fusion; face detection; face tracking; face model; MPEG-4;

DOI:

10.1117/12.665707

Chinese Library Classification:

TB8 [Photographic technology];

Discipline classification code:

0804;

Abstract:

A GMM-based audio-visual speaker verification system is described, and an Active Appearance Model combined with a linear speaker transformation system is used to evaluate the robustness of the verification. An Active Appearance Model (AAM) is used to automatically locate and track a speaker's face in a video recording. A Gaussian Mixture Model (GMM) based classifier (BECARS) is used for face verification. GMM training and testing are performed on DCT-based features extracted from the detected faces. On the audio side, speech features are extracted and used for speaker verification with the same GMM-based classifier. Fusion of the audio and video modalities for audio-visual speaker verification is compared with the face verification and speaker verification systems taken separately. To improve the robustness of the multimodal biometric identity verification system, an audio-visual imposture system is envisioned. It consists of an automatic voice transformation technique that an impostor may use to assume the identity of an authorized client. Features of the transformed voice are then combined with the corresponding appearance features and fed into the GMM-based system BECARS for training. An attempt is made to increase the acceptance rate of the impostor and to analyze the robustness of the verification system. Experiments are being conducted on the BANCA database, with the prospect of experimenting on the PDAtabase newly developed within the scope of the SecurePhone project.
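The verification step outlined in the abstract can be illustrated with a minimal sketch. It is not the BECARS implementation: the diagonal-covariance GMM, the client-versus-world log-likelihood ratio, the weighted-sum fusion rule, and every name and parameter below (`llr_score`, `fuse`, `audio_weight`, `threshold`) are assumptions made for illustration, since the abstract does not specify them.

```python
import math

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        log_terms.append(log_p)
    # Log-sum-exp over mixture components for numerical stability.
    m = max(log_terms)
    return m + math.log(sum(math.exp(t - m) for t in log_terms))

def llr_score(frames, client, world):
    """Average per-frame log-likelihood ratio: client model vs. world model.

    Each model is a (weights, means, variances) tuple (hypothetical layout).
    """
    total = 0.0
    for f in frames:
        total += gmm_log_likelihood(f, *client) - gmm_log_likelihood(f, *world)
    return total / len(frames)

def fuse(audio_score, face_score, audio_weight=0.5):
    """Score-level fusion by weighted sum (an assumed rule, not the paper's)."""
    return audio_weight * audio_score + (1.0 - audio_weight) * face_score

def verify(audio_score, face_score, threshold=0.0, audio_weight=0.5):
    """Accept the identity claim when the fused score reaches the threshold."""
    return fuse(audio_score, face_score, audio_weight) >= threshold

# Toy single-component models over 2-D features (made-up numbers).
client = ([1.0], [[1.0, 1.0]], [[1.0, 1.0]])
world = ([1.0], [[0.0, 0.0]], [[1.0, 1.0]])

audio = llr_score([[1.0, 1.0]], client, world)  # frame at the client mean
face = llr_score([[0.0, 0.0]], client, world)   # frame at the world mean
decision = verify(audio, face)
```

In this toy setting an imposture attempt such as voice transformation would aim to raise the audio score of an impostor's frames, which is why the fused decision depends on how the two modalities are weighted.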


Pages: 11
