Yoga Posture Recognition by Learning Spatial-Temporal Feature with Deep Learning Techniques

Cited by: 2

Authors:

Palanimeera, J. [1]

Ponmozhi, K. [1]

Affiliations:

[1] Kalasalingam Acad Res & Educ, Dept Comp Applicat, Krishnankoil, Tamil Nadu, India

Source:

INTERNATIONAL JOURNAL OF IMAGE AND GRAPHICS | 2024 / Vol. 24 / Issue 06

Keywords:

Identifying yoga postures; deep learning algorithms; RGB camera; spatial and temporal; open pose; Dense-BiLSTM; NETWORKS;

DOI:

10.1142/S0219467824500554

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Despite recent advances in deep learning, yoga posture recognition remains difficult because of cluttered backgrounds, varied settings, occlusions, viewpoint changes, and camera motion. This paper presents a method for accurately detecting yoga poses using deep learning (DL) algorithms. Six yoga poses (Sukhasana, Kakasana, Naukasana, Dhanurasana, Tadasana, and Vrikshasana) were recorded with a standard RGB camera from ten people, five men and five women. A new DL model is presented for representing the spatio-temporal (ST) variation of skeleton-based yoga poses in videos. Several representation learners are used to capture video-level temporal structure, combining spatio-temporal sampling with long-range temporal learning to give an efficient and effective training approach. A novel feature extraction method using OpenPose is described, together with a Dense Bi-directional LSTM (DenseBi-LSTM) network that represents spatial-temporal relationships in both the forward and backward directions, which is intended to make long-range action modeling more effective and consistent. To improve temporal pattern modeling, the Bi-LSTM layers are stacked and combined with dense skip connections. To improve performance, the appearance and motion modalities are combined by a fusion module, and the model is compared with other LSTM-based deep learning models: LSTM, Bi-LSTM, Res-LSTM, and Res-BiLSTM. Experiments on real-time yoga pose datasets show that the proposed DenseBi-LSTM model outperforms state-of-the-art techniques for yoga pose detection.
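The abstract does not give layer sizes or the exact wiring of the dense skip connections. As a rough illustration only, the sketch below assumes DenseNet-style wiring, in which each stacked bidirectional layer receives the per-frame skeleton features concatenated with the outputs of all earlier layers. The function names, the 25-keypoint layout (OpenPose's BODY_25 model), the hidden size, and the layer count are all hypothetical choices, not values from the paper.

```python
from typing import List, Sequence, Tuple


def dense_bilstm_input_sizes(feature_dim: int, hidden_size: int, num_layers: int) -> List[int]:
    """Input width of each layer in a densely connected BiLSTM stack.

    Assumption: layer k receives the per-frame input features concatenated
    with the outputs of all k earlier layers. A bidirectional layer emits
    2 * hidden_size values per frame (forward plus backward states).
    """
    if feature_dim <= 0 or hidden_size <= 0 or num_layers <= 0:
        raise ValueError("all sizes must be positive")
    return [feature_dim + k * 2 * hidden_size for k in range(num_layers)]


def flatten_keypoints(frame_keypoints: Sequence[Tuple[float, float]]) -> List[float]:
    """Flatten one frame of (x, y) keypoints into a single feature vector."""
    return [coord for (x, y) in frame_keypoints for coord in (x, y)]


# Hypothetical settings: 25 keypoints per frame, 2 coordinates each,
# 64 hidden units per direction, 3 stacked layers.
NUM_KEYPOINTS = 25
feature_dim = NUM_KEYPOINTS * 2
sizes = dense_bilstm_input_sizes(feature_dim, hidden_size=64, num_layers=3)
print(sizes)
```

Under these assumed settings the input width grows by 128 values per layer, which is the main cost of dense connectivity compared with a plain stacked Bi-LSTM, where every layer after the first would see only 128 inputs.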

Pages: 24

References (46 in total):

[1] Abdulmunem A., 2017, P INT C PATT REC

[2] [Anonymous], 2007, MM

[3] Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition [J]. Chen, Yuxin; Zhang, Ziqi; Yuan, Chunfeng; Li, Bing; Deng, Ying; Hu, Weiming. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 13339-13348

[4] Human action recognition in videos based on spatiotemporal features and bag-of-poses [J]. da Silva, Murilo Varges; Marana, Aparecido Nilceu. APPLIED SOFT COMPUTING, 2020, 95

[5] Dittakavi B., 2022, P IEEE CVF C COMP VI, P3540

[6] Long-Term Recurrent Convolutional Networks for Visual Recognition and Description [J]. Donahue, Jeff; Hendricks, Lisa Anne; Rohrbach, Marcus; Venugopalan, Subhashini; Guadarrama, Sergio; Saenko, Kate; Darrell, Trevor. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2017, 39 (04): 677-691

[7] Human action recognition on depth dataset [J]. Gao, Zan; Zhang, Hua; Liu, Anan A.; Xu, Guangping; Xue, Yanbing. NEURAL COMPUTING & APPLICATIONS, 2016, 27 (07): 2047-2054

[8] Actions as space-time shapes [J]. Gorelick, Lena; Blank, Moshe; Shechtman, Eli; Irani, Michal; Basri, Ronen. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2007, 29 (12): 2247-2253

[9] Graves A, 2013, 2013 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING (ASRU), P273, DOI 10.1109/ASRU.2013.6707742

[10] Grushin A, 2013, IEEE IJCNN
