Adaptive Feature Selection With Reinforcement Learning for Skeleton-Based Action Recognition

Cited by: 9
Authors
Xu, Zheyuan [1]
Wang, Yingfu [1]
Jiang, Jiaqin [1]
Yao, Jian [1, 2]
Li, Liang [3]
Affiliations
[1] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430072, Peoples R China
[2] Open Univ Guangdong, Sch Artificial Intelligence, Guangzhou 510091, Peoples R China
[3] Shenzhen Polytech, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Action recognition; skeleton; feature selection; reinforcement learning; graph convolutional network;
DOI
10.1109/ACCESS.2020.3038235
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
Skeleton-based action recognition has recently attracted extensive attention in the computer vision community. Previous studies, especially methods based on graph convolutional networks (GCNs), have achieved remarkable improvements on this task. However, existing GCN-based methods apply global average pooling to the extracted features before the classifier. This may hurt recognition performance because it ignores the fact that not all features are equally important along the temporal dimension. To tackle this issue, we propose a feature selection network (FSN) with actor-critic reinforcement learning. Given the extracted feature sequence, FSN learns to adaptively select the most representative features and discard ambiguous ones for action recognition. In addition, conventional graph convolution is a local operation, so it cannot fully capture the non-local joint dependencies that could be vital for recognizing an action. We therefore also propose a generalized graph generation module to capture latent dependencies, and from it a generalized graph convolution network (GGCN). The GGCN and FSN are combined in a three-stream recognition framework, in which different types of information from skeleton data are fused to improve recognition accuracy. Extensive experiments demonstrate that the proposed FSN is a flexible and effective module that can cooperate with any existing GCN-based framework to enhance recognition accuracy, that the proposed GGCN extracts richer skeleton features for skeleton-based action recognition, and that our method achieves superior performance on several public datasets, e.g., 95.7% top-1 accuracy on NTU-RGB+D and 86.7% top-1 accuracy on NTU-RGB+D 120.
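The contrast the abstract draws between global average pooling and temporal feature selection can be illustrated with a minimal sketch. This is not the authors' FSN: the `keep` mask below is written by hand purely for illustration, whereas the paper learns the selection with an actor-critic policy. The function names, the toy feature values, and the fallback for an empty selection are all assumptions.

```python
from typing import List, Sequence


def global_average_pool(features: Sequence[Sequence[float]]) -> List[float]:
    """Average a T x C feature sequence over time; every frame gets equal weight."""
    num_frames = len(features)
    num_channels = len(features[0])
    return [
        sum(frame[c] for frame in features) / num_frames
        for c in range(num_channels)
    ]


def selected_average_pool(
    features: Sequence[Sequence[float]], keep: Sequence[bool]
) -> List[float]:
    """Average only the frames flagged in `keep` (hypothetical hand-set mask).

    Falls back to plain global average pooling if no frame is kept
    (an assumption of this sketch, not something stated in the abstract).
    """
    kept = [frame for frame, flag in zip(features, keep) if flag]
    if not kept:
        return global_average_pool(features)
    return global_average_pool(kept)


# Toy sequence: 4 time steps, 2 channels. The third step is an outlier
# standing in for an "ambiguous" feature.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [1.0, 0.0]]
keep = [True, True, False, True]  # assumed mask; FSN would learn this

gap = global_average_pool(features)          # outlier dilutes the average
sel = selected_average_pool(features, keep)  # outlier is discarded
print(gap, sel)
```

With the outlier frame dropped, the pooled vector stays closer to the three consistent frames, which is the intuition behind replacing uniform pooling with a learned selection.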
Pages: 213038-213051
Page count: 14