Dataset Transformation System for Sign Language Recognition Based on Image Classification Network

Cited by: 3

Authors:

Choi, Sang-Geun [1]

Park, Yeonji [1]

Sohn, Chae-Bong [1]

Affiliations:

[1] Kwangwoon Univ, Dept Elect & Commun Engn, Seoul 01897, South Korea

Source:

APPLIED SCIENCES-BASEL | 2022 / Vol. 12 / Issue 19

Funding:

National Research Foundation, Singapore;

Keywords:

motion recognition; dataset transformation system; sign language recognition; spatial-temporal map (STmap); image classification model; human motion recognition

DOI:

10.3390/app121910075

Chinese Library Classification (CLC) number:

O6 [Chemistry];

Discipline classification code:

0703;

Abstract:

Among the fields where deep learning is applied, motion recognition still poses open challenges. One is that the vast amount of data makes datasets difficult to manage. Another is that training takes a long time because of the complex networks and the large amount of data. To address these problems, we propose a dataset transformation system and evaluate it on sign language recognition. The system consists of three steps: pose estimation, normalization, and spatial-temporal map (STmap) generation. An STmap expresses temporal and spatial data simultaneously in a single image. In addition, data augmentation improved the accuracy of the model and lowered its error sensitivity. The proposed method reduced the dataset from 94.39 GB to 954 MB, approximately 1% of the original size. When an image classification model is trained on the dataset created by the proposed method, the sign language recognition accuracy is 84.5%.
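The normalization and STmap generation steps named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the map layout, so everything below is an assumption made for illustration: keypoints are taken as already extracted by a pose estimator (toy coordinates stand in for that step), normalization uses the clip-wide bounding box, rows index keypoints (space), columns index frames (time), and the normalized x and y values are stored in the first two color channels. The names `normalize` and `make_stmap` are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Keypoint = Tuple[float, float]   # (x, y) in pixel coordinates
Frame = List[Keypoint]           # one pose: K keypoints
Pixel = Tuple[int, int, int]     # 8-bit (R, G, B)


def normalize(frames: List[Frame]) -> List[Frame]:
    """Rescale x and y into [0, 1] using the clip-wide bounding box (assumed scheme)."""
    xs = [x for frame in frames for x, _ in frame]
    ys = [y for frame in frames for _, y in frame]
    x0, y0 = min(xs), min(ys)
    # Guard against a zero-sized bounding box on either axis.
    w = (max(xs) - x0) or 1.0
    h = (max(ys) - y0) or 1.0
    return [[((x - x0) / w, (y - y0) / h) for x, y in frame] for frame in frames]


def make_stmap(frames: List[Frame]) -> List[List[Pixel]]:
    """Build a K x T image: row = keypoint (space), column = frame (time)."""
    norm = normalize(frames)
    n_keypoints = len(norm[0])
    return [
        # R channel holds x, G channel holds y, B channel is left at zero.
        [(round(frame[k][0] * 255), round(frame[k][1] * 255), 0) for frame in norm]
        for k in range(n_keypoints)
    ]


# Toy clip: 3 frames, 2 keypoints each (made-up coordinates).
clip = [
    [(0.0, 0.0), (100.0, 50.0)],
    [(10.0, 5.0), (100.0, 50.0)],
    [(20.0, 10.0), (100.0, 50.0)],
]
stmap = make_stmap(clip)
print(len(stmap), "rows x", len(stmap[0]), "columns")
```

Under these assumptions a whole video clip collapses into one small image with one pixel per keypoint per frame, which is the kind of representation that would let an ordinary image classification model consume motion data and would account for a large reduction in dataset size.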

Pages: 15
