Sign Transition Modeling and a Scalable Solution to Continuous Sign Language Recognition for Real-World Applications

Cited by: 44
Authors
Li, Kehuang [1]
Zhou, Zhengyu [2, 3]
Lee, Chin-Hui [1]
Affiliations
[1] Georgia Inst Technol, Sch Elect & Comp Engn, 777 Atlantic Dr NW, Atlanta, GA 30332 USA
[2] Robert Bosch LLC, Res & Technol Ctr, Stuttgart, Germany
[3] Bosch Res & Technol Ctr North Amer, 4005 Miranda Ave,200, Palo Alto, CA 94304 USA
Keywords
Sign language recognition; transition modeling; speech recognition; hidden Markov models
DOI
10.1145/2850421
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
We propose a new approach to modeling transition information between signs in continuous Sign Language Recognition (SLR) and address some scalability issues in designing SLR systems. In contrast to Automatic Speech Recognition (ASR), in which the transition between speech sounds is often brief and mainly addressed by the coarticulation effect, the sign transition in continuous SLR is far from clear and is usually not easily or exactly characterized. Leveraging hidden Markov modeling techniques from ASR, we propose a modeling framework for continuous SLR with the following major advantages: (i) the system is easy to scale up to large-vocabulary SLR; (ii) modeling of signs as well as the transitions between signs is robust even for noisy data collected in real-world SLR; and (iii) extensions to training, decoding, and adaptation are directly applicable even with new deep learning algorithms. A pair of low-cost digital gloves affordable for the deaf and hard of hearing community is used to collect training and testing data for real-world SLR interaction applications. Evaluated on 1,024 testing sentences from five signers, the system achieves a word accuracy rate of 87.4% with a vocabulary of 510 words. Recognition runs in real time, requiring an average of 0.69 s per sentence. These encouraging results indicate that it is feasible to develop real-world SLR applications based on the proposed SLR framework.
Pages: 23