Video-Based Martial Arts Combat Action Recognition and Position Detection Using Deep Learning

Authors
Wu, Baoyuan [1, 2, 3]
Zhou, Jiali [4]
Affiliations
[1] Chengdu Sport Univ, Sch Wushu, Chengdu 610093, Peoples R China
[2] Chengdu Sport Univ, Chinese Guoshu Acad, Chengdu 610093, Peoples R China
[3] Adamson Univ, Coll Educ, Manila 1000, Philippines
[4] Sichuan Technol & Business Univ, Sch Phys Educ, Chengdu 611745, Peoples R China
Source
IEEE ACCESS | 2024 | Vol. 12
Keywords
Art; Three-dimensional displays; Location awareness; Feature extraction; Skeleton; Training; Point cloud compression; Image recognition; Face recognition; Computational modeling; Deep learning; vision transformer; event detection; video classification; martial art; NETWORKS
DOI
10.1109/ACCESS.2024.3487289
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Action recognition in martial arts can offer valuable insights for technicians, athletes, and coaches. Accurate action recognition can enhance performance analysis, inform training strategies, and improve decision-making by providing detailed evaluations of technique execution, movement patterns, and match dynamics. This can lead to more effective coaching, better athlete preparation, and a deeper understanding of competitive outcomes. Existing human action recognition methods often struggle with background clutter, occlusion, and variations in appearance and speed, particularly in dynamic combat scenarios. In this study, we propose a novel Spatio-Temporal Hierarchical Keypoint Aggregation (ST-HKA) framework for martial arts combat action recognition and localization. Unlike conventional methods that use graph convolutional networks or appearance-based techniques, ST-HKA is a deep learning approach that treats human skeleton keypoints as a 3D point cloud, which significantly improves scalability and robustness against occlusion and variations in appearance. Additionally, we incorporate a weakly supervised spatio-temporal action localization mechanism using a Context-Aware Pooling Mechanism. The proposed model was evaluated on both the Kinetics Human-Action and Taekwondo datasets, demonstrating superior performance in recognizing complex martial arts actions. ST-HKA achieves a Top-1 accuracy of 88.6% and an F1-score of 83.9% on the Kinetics dataset, and an accuracy of 88.7% and an F1-score of 84.4% on the Taekwondo dataset. The model also detects temporal action boundaries with higher precision, as reflected by its strong performance on action localization tasks. These results highlight the effectiveness of ST-HKA in handling complex martial arts actions with high accuracy and robustness.
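The abstract does not spell out how skeleton keypoints become a 3D point cloud, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes each frame yields 2D keypoints with a detector confidence, that x and y are already normalized to [0, 1], and that the third coordinate is normalized frame time. The function name `keypoints_to_point_cloud` and the `min_conf` threshold are hypothetical.

```python
from typing import List, Tuple

# One detected keypoint in a frame: (x, y, confidence). This layout is an assumption.
Keypoint = Tuple[float, float, float]
# One point of the cloud: (x, y, t), where t is the normalized frame time.
Point3D = Tuple[float, float, float]


def keypoints_to_point_cloud(
    frames: List[List[Keypoint]], min_conf: float = 0.3
) -> List[Point3D]:
    """Flatten per-frame 2D keypoints into an unordered (x, y, t) point set.

    Keypoints below min_conf (for example, occluded joints) are dropped.
    A point set, unlike a fixed skeleton graph, does not require every
    joint to be present in every frame.
    """
    n_frames = len(frames)
    cloud: List[Point3D] = []
    for t, frame in enumerate(frames):
        # Map the frame index to [0, 1]; a single-frame clip gets t = 0.
        t_norm = t / (n_frames - 1) if n_frames > 1 else 0.0
        for x, y, conf in frame:
            if conf >= min_conf:
                cloud.append((x, y, t_norm))
    return cloud


# Two frames with two joints each; the second joint in the last frame
# has low confidence and is treated as occluded.
frames = [
    [(0.10, 0.20, 0.9), (0.30, 0.40, 0.8)],
    [(0.12, 0.22, 0.9), (0.31, 0.41, 0.1)],
]
cloud = keypoints_to_point_cloud(frames)
```

Here `cloud` holds three points rather than four, because the low-confidence joint is discarded instead of being filled in. This illustrates why a point-set representation can tolerate occlusion: a downstream point cloud network would consume a variable-size set rather than a fixed joint graph.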
Pages: 161357-161374
Page count: 18