An Efficient Feature Fusion of Graph Convolutional Networks and Its Application for Real-Time Traffic Control Gestures Recognition

Cited by: 4
Authors
Dinh-Tan Pham [1, 2]
Quang-Tien Pham [3]
Thi-Lan Le [2, 3]
Hai Vu [2, 3]
Affiliations
[1] Hanoi Univ Min & Geol, Fac Informat Technol, Hanoi 100000, Vietnam
[2] Hanoi Univ Sci & Technol, MICA Int Res Inst, Comp Vis Dept, Hanoi 100000, Vietnam
[3] Hanoi Univ Sci & Technol, Sch Elect & Telecommun, Hanoi 100000, Vietnam
Keywords
Joints; Skeleton; Feature extraction; Real-time systems; Convolutional neural networks; Gesture recognition; Data models; Autonomous vehicles; graph convolutional network; human action recognition; traffic control gestures;
DOI
10.1109/ACCESS.2021.3109255
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Recently, skeleton-based gesture and action recognition has emerged thanks to progress in human pose estimation. Gesture representation using skeletal data is robust because skeletal data are invariant to an individual's appearance. Among the approaches proposed for skeleton-based action and gesture recognition, the Graph Convolutional Network (GCN) and its variants have attracted great attention thanks to their ability to capture the inherent graph structure of skeletal data. In this paper, we aim to design an efficient scheme that uses relative joints of skeleton sequences within a GCN framework. Spatial features (i.e., joint positions) and temporal features (i.e., joint velocities) are combined to form the input of the Attention-enhanced Adaptive GCN (AAGCN). The proposed framework addresses the limitations of the original AAGCN on challenging datasets with incomplete and noisy skeletal data. Extensive experiments are carried out on three datasets: CMDFALL, MICA-Action3D, and NTU-RGBD. Experimental results show that the proposed method achieves superior performance compared with existing methods. Moreover, to illustrate the application of the proposed method to real-time traffic control gesture recognition for autonomous vehicles, we evaluate it on the TCG dataset. The results show that the proposed method offers real-time computation capability and good recognition accuracy. These results suggest a promising route to deploying real-time, robust gesture recognition for gesture-based traffic control in autonomous vehicles.
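The fusion of spatial and temporal features described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption: relative joints are taken as offsets from a single reference joint (`ref_joint`, index 0 by default), velocity is taken as the frame-to-frame difference of joint positions (zero at the first frame), and the function name `fuse_features` is hypothetical. The sketch covers only input construction, not the AAGCN model itself.

```python
from typing import List

# One frame holds V joints; each joint holds C coordinates (e.g. x, y, z).
Frame = List[List[float]]


def fuse_features(seq: List[Frame], ref_joint: int = 0) -> List[Frame]:
    """Concatenate relative joint positions with per-frame joint velocities.

    Assumptions (not taken from the paper): positions are expressed relative
    to `ref_joint`, and velocity is the difference from the previous frame.
    Input shape is T x V x C; output shape is T x V x 2C.
    """
    fused: List[Frame] = []
    for t, frame in enumerate(seq):
        ref = frame[ref_joint]
        # The first frame has no predecessor, so its velocity is zero.
        prev = seq[t - 1] if t > 0 else frame
        out_frame: Frame = []
        for joint, prev_joint in zip(frame, prev):
            rel = [c - r for c, r in zip(joint, ref)]         # spatial part
            vel = [c - p for c, p in zip(joint, prev_joint)]  # temporal part
            out_frame.append(rel + vel)                       # channel concat
        fused.append(out_frame)
    return fused
```

Under these assumptions, a sequence of T frames with V joints and 3 coordinates per joint becomes a T x V x 6 input, which would then be fed to the graph network in place of raw joint coordinates.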
Pages: 121930-121943
Page count: 14