3D action recognition;
depth map sequence;
CNN;
transfer learning;
bi-directional LSTM;
RNN;
attention;
BIDIRECTIONAL LSTM;
FUSION;
IMAGE;
2D;
DOI:
10.3390/s22186841
Chinese Library Classification:
O65 [Analytical Chemistry];
Subject classification codes:
070302;
081704;
Abstract:
Deep models that recognize human actions from depth video sequences are scarce compared with models based on RGB and skeleton video sequences. This scarcity limits research progress on depth data, because training deep models on small-scale data is challenging. In this work, we propose a deep sequence classification model for depth video data in scenarios where the video data are limited. Unlike methods that summarize the contents of each frame into a single class, our method directly classifies a depth video, i.e., a sequence of depth frames. First, the proposed system transforms an input depth video into three sequences of multi-view temporal motion frames. Together with these three temporal motion sequences, the input depth frame sequence gives a four-stream representation of the input depth action video. Next, the DenseNet121 architecture with ImageNet pre-trained weights extracts discriminative frame-level action features from the depth and temporal motion frames. The four resulting sets of feature vectors, one per stream, are fed into four bi-directional LSTM (BLSTM) networks. The temporal features are further analyzed through multi-head self-attention (MHSA) to capture multi-view sequence correlations. Finally, the concatenated outputs are processed through dense layers to classify the input depth video. Experimental results on two small-scale benchmark depth datasets, MSRAction3D and DHA, demonstrate that the proposed framework is effective even with insufficient training samples and outperforms existing depth data-based action recognition methods.
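The attention and fusion steps named in the abstract can be illustrated with a small standard-library sketch. It is not the authors' implementation: the toy vectors stand in for the DenseNet121/BLSTM features, the query, key, and value projections are identity maps rather than learned weights, and the mean pooling over time and the helper names (`self_attention`, `multi_head_self_attention`, `fuse_streams`) are assumptions made for illustration only. The abstract does not say how MHSA is wired across the four streams, so here it is applied to each stream separately before concatenation.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Scaled dot-product self-attention with identity Q/K/V (assumption)."""
    d = len(seq[0])
    out = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, seq)) for j in range(d)])
    return out


def multi_head_self_attention(seq, num_heads):
    """Split each vector into equal slices, attend per head, re-concatenate."""
    d = len(seq[0])
    assert d % num_heads == 0, "feature size must be divisible by num_heads"
    hd = d // num_heads
    heads = [
        self_attention([x[h * hd:(h + 1) * hd] for x in seq])
        for h in range(num_heads)
    ]
    return [sum((head[t] for head in heads), []) for t in range(len(seq))]


def fuse_streams(streams, num_heads=2):
    """Attend within each stream, mean-pool over time, concatenate streams."""
    fused = []
    for seq in streams:
        attended = multi_head_self_attention(seq, num_heads)
        steps = len(attended)
        width = len(attended[0])
        fused.extend(sum(step[j] for step in attended) / steps for j in range(width))
    return fused


# Four toy streams (depth + three motion views), 3 time steps, 4 features each.
streams = [[[0.1 * (s + t + j) for j in range(4)] for t in range(3)] for s in range(4)]
fused = fuse_streams(streams, num_heads=2)
print(len(fused))
```

In a full model, the fused vector would be passed to dense layers for classification, and each projection would be a trained weight matrix.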
Affiliations:
Peking Univ, Sch Math Sci, Dept Informat Sci, Beijing, Peoples R China
Peking Univ, LMAM, Beijing, Peoples R China
Bulbul, Mohammad Farhad
Jiang, Yunsheng
Ma, Jinwen