Analysis of 3D hand trajectory gestures using stroke-based composite hidden Markov models

Cited by: 27
Authors
Kim, IC [1]
Chien, SI [1]
Affiliations
[1] Kyungpook Natl Univ, Sch Elect & Elect Engn, Taegu 702701, South Korea
Keywords
3D hand trajectory; gesture recognition; Polhemus sensor; stroke-based composition; hidden Markov model; coarticulation effect
DOI
10.1023/A:1011231305559
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a glove-based hand gesture recognition system that uses hidden Markov models (HMMs) to recognize the unconstrained 3D trajectory gestures of operators in a remote work environment. A Polhemus sensor attached to a PinchGlove is used to obtain a sequence of 3D positions along the hand trajectory. Using the 3D data directly lets operators produce gestures more naturally, because it avoids some of the constraints usually imposed to prevent performance degradation when trajectory data are projected onto a specific 2D plane. We use two kinds of HMMs, distinguished by the basic unit being modeled: gesture-based HMMs and stroke-based HMMs. Decomposing gestures into more primitive strokes is attractive because the stroke-based HMMs can in turn be concatenated to construct a new set of gesture-based HMMs. Any deterioration in performance and reliability arising from the decomposition can be remedied by a fine-tuned relearning process for such composite HMMs. We also propose an efficient method of estimating a variable reliability threshold for an HMM, which is found to be useful in rejecting unreliable patterns. In recognition experiments on 16 types of gestures defined for remote work, the fine-tuned composite HMM achieves the best performance, with a recognition rate of 96.88%, and also the highest reliability.
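The composition step described in the abstract can be illustrated with a minimal sketch. It assumes left-to-right stroke HMMs whose last state keeps part of its probability mass as an "exit" probability, and it joins them by routing that exit mass to the first state of the next stroke. The function name `concat_left_to_right`, the two example matrices, and the exit-mass convention are all hypothetical choices made for illustration; the abstract does not specify the topology, the emission models, or the relearning procedure, and none of those are shown here.

```python
def concat_left_to_right(strokes):
    """Build a composite transition matrix from stroke transition matrices.

    Assumption (not from the source): each stroke matrix is square, and any
    probability mass missing from its last row is that stroke's exit
    probability.
    """
    total = sum(len(a) for a in strokes)
    composite = [[0.0] * total for _ in range(total)]
    offset = 0
    for k, a in enumerate(strokes):
        n = len(a)
        # Copy the stroke's own transitions onto the block diagonal.
        for i in range(n):
            for j in range(n):
                composite[offset + i][offset + j] = a[i][j]
        # Route the exit mass of the last state to the next stroke's first state.
        if k + 1 < len(strokes):
            exit_mass = 1.0 - sum(a[n - 1])
            composite[offset + n - 1][offset + n] = exit_mass
        offset += n
    return composite


# Two hypothetical 2-state strokes (values invented for illustration).
stroke_a = [[0.6, 0.4],
            [0.0, 0.7]]  # last state exits with the remaining mass
stroke_b = [[0.5, 0.5],
            [0.0, 0.8]]

composite = concat_left_to_right([stroke_a, stroke_b])
for row in composite:
    print([round(p, 3) for p in row])
```

In this sketch the last row of the composite matrix still sums to less than one, which leaves the final stroke's exit mass available for a terminal state. A composite model built this way would then be re-estimated on whole-gesture data, which corresponds to the relearning step the abstract mentions.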
Pages: 131-143
Number of pages: 13