Audio-visual imposture

Authors
Karam, Walid [1]
Mokbel, Chafic [1]
Greige, Hanna [1]
Chollet, Gerard [2]
Affiliations
[1] Univ Balamand, Dept Comp Sci, POB 100, Tripoli, Lebanon
[2] Ecole Natl Super Telecommun Bretagne, F-75634 Paris, France
Source
MOBILE MULTIMEDIA/IMAGE PROCESSING FOR MILITARY AND SECURITY APPLICATIONS | 2006 / Vol. 6250
Keywords
speaker verification; voice transformation; active appearance models; Gaussian mixture models; modality fusion; face detection; face tracking; face model; MPEG-4
DOI
10.1117/12.665707
Chinese Library Classification number
TB8 [Photographic technology]
Discipline classification code
0804
Abstract
A GMM-based audio-visual speaker verification system is described, and an Active Appearance Model combined with a linear speaker transformation system is used to evaluate the robustness of the verification. An Active Appearance Model (AAM) is used to automatically locate and track a speaker's face in a video recording. A Gaussian Mixture Model (GMM) based classifier (BECARS) is used for face verification. GMM training and testing are performed on DCT-based features extracted from the detected faces. On the audio side, speech features are extracted and used for speaker verification with the GMM-based classifier. Fusion of the audio and video modalities for audio-visual speaker verification is compared with the separate face verification and speaker verification systems. To improve the robustness of the multimodal biometric identity verification system, an audio-visual imposture system is envisioned. It consists of an automatic voice transformation technique that an impostor may use to assume the identity of an authorized client. Features of the transformed voice are then combined with the corresponding appearance features and fed into the GMM-based system BECARS for training. An attempt is made to increase the acceptance rate of the impostor and to analyze the robustness of the verification system. Experiments are being conducted on the BANCA database, with a prospect of experimenting on the PDAtabase newly developed within the scope of the SecurePhone project.
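The verification and fusion steps described in the abstract can be illustrated with a minimal sketch. It is not the BECARS implementation, and the abstract does not state the scoring or fusion rule. The sketch assumes diagonal-covariance GMMs, an average log-likelihood ratio against a background ("world") model, and a weighted-sum fusion of the audio and video scores. All function names, the `audio_weight` parameter, and the threshold are hypothetical.

```python
import math

# Illustrative sketch only: names, the weighted-sum fusion rule, and the
# threshold are assumptions, not details taken from the paper or BECARS.

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, gmm):
    """Log p(x | gmm), where gmm is a list of (weight, mean, var) components."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp trick for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def llr_score(frames, client_gmm, world_gmm):
    """Average per-frame log-likelihood ratio: client model vs. world model."""
    total = sum(
        gmm_log_likelihood(f, client_gmm) - gmm_log_likelihood(f, world_gmm)
        for f in frames
    )
    return total / len(frames)

def fuse(audio_score, video_score, audio_weight=0.5):
    """Score-level fusion as a weighted sum (assumed rule)."""
    return audio_weight * audio_score + (1.0 - audio_weight) * video_score

def verify(audio_score, video_score, threshold=0.0, audio_weight=0.5):
    """Accept the identity claim if the fused score exceeds the threshold."""
    return fuse(audio_score, video_score, audio_weight) > threshold
```

In this framing, an imposture attempt succeeds when transformed voice features raise the audio score enough to push the fused score over the threshold, so the choice of `audio_weight` directly affects how much a voice transformation alone can achieve.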
Pages: 11