Demo: The Sound of Silence: End-to-End Sign Language Recognition Using SmartWatch

Cited: 9
Authors
Dai, Qian [1 ]
Hou, Jiahui [2 ]
Yang, Panlong [1 ]
Li, Xiangyang [1 ]
Wang, Fei [1 ]
Zhang, Xumiao [1 ]
Affiliations
[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China
[2] IIT, Chicago, IL 60616 USA
Source
PROCEEDINGS OF THE 23RD ANNUAL INTERNATIONAL CONFERENCE ON MOBILE COMPUTING AND NETWORKING (MOBICOM '17) | 2017
Keywords
Wearable computing; activity recognition; mobile sensing
DOI
10.1145/3117811.3119853
Chinese Library Classification (CLC)
TP3 [computing technology; computer technology]
Discipline code
0812
Abstract
Sign language is a natural and fully-fledged communication method for deaf and hearing-impaired people. In this demo, we propose the first SmartWatch-based American Sign Language (ASL) recognition system, which is more comfortable, portable, and user-friendly than prior approaches and offers accessibility anytime, anywhere. The system builds on the intuitive idea that each sign has a specific motion pattern, which can be captured as unique gyroscope and accelerometer signals and then learned by a Long Short-Term Memory recurrent neural network (LSTM-RNN) trained with Connectionist Temporal Classification (CTC). In this way, signs and their context can be correctly recognized using an off-the-shelf device (e.g., a SmartWatch or Smartphone). In the known-user split task, experiments show that our system reaches an average word error rate of 7.29% when recognizing 73 sentences formed from 103 ASL signs, and achieves a detection ratio of up to 93.7% for a single sign. The results also show that our system adapts well to new users, achieving an average word error rate of 21.6% at the sentence level and an average detection ratio of 79.4%. Moreover, our system performs real-time ASL translation, outputting speech within 1.69 seconds on average for a sentence of 12 signs.
Pages: 462-464 (3 pages)