Multimodal Biometric Human Recognition for Perceptual Human-Computer Interaction

Cited by: 30
Authors
Jiang, Richard M. [1]
Sadka, Abdul H. [2]
Crookes, Danny [3]
Affiliations
[1] Univ Loughborough, Sch Comp Sci, Loughborough LE11 3TJ, Leics, England
[2] Brunel Univ, Dept Elect & Comp Engn, London UB8 2QE, England
[3] Queens Univ Belfast, Sch Elect Elect Engn & Comp Sci, Inst Elect Commun & Informat Technol, Belfast BT7, Antrim, Northern Ireland
Source
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS | 2010 / Vol. 40 / No. 06
Keywords
Laplacian Eigenmap; low-level feature fusion; multimodal biometrics; perceptual human-computer interaction (HCI); speaker recognition; TRACKING; SPEECH; FUSION;
DOI
10.1109/TSMCC.2010.2050476
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, a novel video-based multimodal biometric verification scheme using subspace-based low-level feature fusion of face and speech is developed for specific speaker recognition in perceptual human-computer interaction (HCI). In the proposed scheme, the human face is tracked and the face pose is estimated to weight the detected facelike regions in successive frames, where ill-posed faces and false-positive detections are assigned lower credit to enhance accuracy. In the audio modality, mel-frequency cepstral coefficients are extracted for voice-based biometric verification. In the fusion step, features from both modalities are projected into a nonlinear Laplacian Eigenmap subspace for multimodal speaker recognition and combined at low level. The proposed approach is tested on a video database of ten human subjects, and the results show that the proposed scheme attains better accuracy than conventional multimodal fusion using latent semantic analysis as well as the single-modality verifications. Experiments in MATLAB show the potential of the proposed scheme to attain real-time performance for perceptual HCI applications.
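The fusion step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation (the abstract reports MATLAB experiments and gives no algorithmic detail), and it makes several assumptions: the face and audio features are synthetic placeholders, the two modalities are standardised and concatenated before projection, and the neighbourhood graph uses k nearest neighbours with heat-kernel weights. The function names `zscore` and `laplacian_eigenmap` are hypothetical.

```python
import numpy as np


def zscore(x):
    """Standardise each column so neither modality dominates the distances."""
    return (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-12)


def laplacian_eigenmap(x, n_components=2, n_neighbors=5):
    """Embed the rows of x with a basic Laplacian Eigenmap (assumed variant)."""
    n = x.shape[0]
    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(x ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * x @ x.T, 0.0)
    # k nearest neighbours of each sample, excluding the sample itself.
    knn = np.argsort(d2 + np.diag(np.full(n, np.inf)), axis=1)[:, :n_neighbors]
    mask = np.zeros((n, n), dtype=bool)
    mask[np.repeat(np.arange(n), n_neighbors), knn.ravel()] = True
    mask |= mask.T  # symmetrise the neighbourhood graph
    # Heat-kernel weights; the bandwidth is the median neighbour distance.
    w = np.where(mask, np.exp(-d2 / np.median(d2[mask])), 0.0)
    deg = w.sum(axis=1)
    # Solve L y = lambda D y through the symmetric normalised Laplacian.
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    l_sym = np.eye(n) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(l_sym)  # eigenvalues in ascending order
    # Drop the first eigenvector (the trivial one for a connected graph).
    return d_inv_sqrt[:, None] * vecs[:, 1:n_components + 1]


# Synthetic stand-ins for real features: 2 subjects, 20 samples each.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 20)
face = rng.normal(size=(40, 16)) + 2.0 * labels[:, None]   # placeholder face features
mfcc = rng.normal(size=(40, 13)) + 2.0 * labels[:, None]   # placeholder MFCC vectors

# Low-level fusion (assumed here): concatenate, then project jointly.
fused = np.hstack([zscore(face), zscore(mfcc)])
embedding = laplacian_eigenmap(fused, n_components=2, n_neighbors=5)
print(embedding.shape)
```

A classifier such as nearest neighbour could then operate on the rows of `embedding`. Whether the original scheme concatenates before or after projection is not stated in the abstract, so the ordering used here is only one plausible reading.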
Pages: 676-681
Page count: 6