Online Early-Late Fusion Based on Adaptive HMM for Sign Language Recognition

Cited by: 45

Authors:

Guo, Dan [1]

Zhou, Wengang [2]

Li, Houqiang [2]

Wang, Meng [1]

Affiliations:

[1] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei 230601, Anhui, Peoples R China

[2] Univ Sci & Technol China, EEIS Dept, Hefei 230027, Anhui, Peoples R China

Source:

ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS | 2018, Vol. 14, No. 1

Keywords:

Sign language recognition; multi-modal feature fusion; query-adaptive; HMM; online algorithm;

DOI:

10.1145/3152121

Chinese Library Classification:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

In sign language recognition (SLR) with multimodal data, a sign word can be represented by multiple features, which have an intrinsic property and a mutually complementary relationship. To fully explore these relationships, we propose an online early-late fusion method based on the adaptive Hidden Markov Model (HMM). In terms of the intrinsic property, we discover that the inherent latent change states of each sign are related not only to the number of key gestures and body poses but also to their translation relationships. We propose an adaptive HMM method to obtain the hidden state number of each sign by affinity propagation clustering. For the complementary relationship, we propose an online early-late fusion scheme. The early fusion (feature fusion) is dedicated to preserving useful information to achieve a better complementary score, while the late fusion (score fusion) uncovers the significance of those features and aggregates them by weighting. Unlike classical fusion methods, the fusion is query adaptive. For different queries, after feature selection (including the combined feature), the fusion weight is inversely proportional to the area under the curve of the normalized query score list for each selected feature. The whole fusion process is effective and efficient. Experiments verify its effectiveness on signer-independent SLR with a large vocabulary. Whether compared across different dataset sizes or against different SLR models, our method demonstrates consistent and promising performance.

Pages: 18
