A Unified Framework for Gesture Recognition and Spatiotemporal Gesture Segmentation

Cited by: 211
Authors
Alon, Jonathan [1 ]
Athitsos, Vassilis [2 ]
Yuan, Quan [1 ]
Sclaroff, Stan [1 ]
Affiliations
[1] Boston Univ, Dept Comp Sci, Boston, MA 02215 USA
[2] Univ Texas Arlington, Dept Comp Sci & Engn, Arlington, TX 76019 USA
Funding
National Science Foundation (USA);
Keywords
Gesture recognition; gesture spotting; human motion analysis; dynamic time warping; continuous dynamic programming; HIDDEN MARKOV-MODELS; HAND GESTURES; TRACKING;
DOI
10.1109/TPAMI.2008.203
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Within the context of hand gesture recognition, spatiotemporal gesture segmentation is the task of determining, in a video sequence, where the gesturing hand is located and when the gesture starts and ends. Existing gesture recognition methods typically assume either known spatial segmentation or known temporal segmentation, or both. This paper introduces a unified framework for simultaneously performing spatial segmentation, temporal segmentation, and recognition. In the proposed framework, information flows both bottom-up and top-down. A gesture can be recognized even when the hand location is highly ambiguous and when information about when the gesture begins and ends is unavailable. Thus, the method can be applied to continuous image streams where gestures are performed in front of moving, cluttered backgrounds. The proposed method consists of three novel contributions: a spatiotemporal matching algorithm that can accommodate multiple candidate hand detections in every frame, a classifier-based pruning framework that enables accurate and early rejection of poor matches to gesture models, and a subgesture reasoning algorithm that learns which gesture models can falsely match parts of other longer gestures. The performance of the approach is evaluated on two challenging applications: recognition of hand-signed digits gestured by users wearing short-sleeved shirts, in front of a cluttered background, and retrieval of occurrences of signs of interest in a video database containing continuous, unsegmented signing in American Sign Language (ASL).
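The abstract's central technical device is a dynamic-time-warping style matcher in which every query frame contributes several candidate hand detections rather than one committed hand location. The sketch below illustrates only that idea, assuming Euclidean feature distances and toy inputs; the function name multi_candidate_dtw and all example data are hypothetical, and the paper's pruning classifiers, subgesture reasoning, and gesture spotting in unsegmented streams are omitted.

```python
# Minimal sketch (not the authors' implementation) of DTW matching where each
# query frame offers multiple candidate hand detections; the warping path is
# free to use whichever candidate best fits the gesture model at that frame.
import numpy as np

def multi_candidate_dtw(model, query_candidates):
    """model: list of T model feature vectors (np.ndarray).
    query_candidates: list of Q frames, each a list of candidate feature
    vectors (one per hypothesized hand location in that frame).
    Returns the cost of aligning the whole model to the whole query."""
    T, Q = len(model), len(query_candidates)
    D = np.full((T + 1, Q + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, T + 1):
        for j in range(1, Q + 1):
            # Local cost: cheapest candidate detection in query frame j-1.
            local = min(np.linalg.norm(model[i - 1] - c)
                        for c in query_candidates[j - 1])
            # Standard DTW transitions: advance the model, the query, or both.
            D[i, j] = local + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[T, Q]

# Toy usage: a 3-frame model matched against a 4-frame query with two
# candidate detections per frame (the second candidate is always a distractor).
model = [np.array([0.0, 0.0]), np.array([1.0, 1.0]), np.array([2.0, 2.0])]
query = [[np.array([0.1, 0.0]), np.array([5.0, 5.0])],
         [np.array([0.9, 1.0]), np.array([4.0, 4.0])],
         [np.array([1.5, 1.4]), np.array([3.0, 3.0])],
         [np.array([2.0, 2.1]), np.array([6.0, 6.0])]]
print(multi_candidate_dtw(model, query))
```

Taking the minimum over candidates inside the recurrence is what lets recognition proceed when the hand location is ambiguous: the warping path implicitly selects, frame by frame, the detection that best matches the gesture model, so no hard spatial segmentation decision is needed up front.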
Pages: 1685-1699
Page count: 15