Video-Based Martial Arts Combat Action Recognition and Position Detection Using Deep Learning

Cited by: 0
Authors
Wu, Baoyuan [1 ,2 ,3 ]
Zhou, Jiali [4 ]
Affiliations
[1] Chengdu Sport Univ, Sch Wushu, Chengdu 610093, Peoples R China
[2] Chengdu Sport Univ, Chinese Guoshu Acad, Chengdu 610093, Peoples R China
[3] Adamson Univ, Coll Educ, Manila 1000, Philippines
[4] Sichuan Technol & Business Univ, Sch Phys Educ, Chengdu 611745, Peoples R China
Source
IEEE ACCESS | 2024 / Volume 12
Keywords
Art; Three-dimensional displays; Location awareness; Feature extraction; Skeleton; Training; Point cloud compression; Image recognition; Face recognition; Computational modeling; Deep learning; vision transformer; event detection; video classification; martial art; NETWORKS;
DOI
10.1109/ACCESS.2024.3487289
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Action recognition in martial arts can offer valuable insights for technicians, athletes, and coaches. Accurate recognition supports performance analysis, informs training strategies, and improves decision-making by providing detailed evaluations of technique execution, movement patterns, and match dynamics, leading to more effective coaching, better athlete preparation, and a deeper understanding of competitive outcomes. Existing human action recognition methods often struggle with background clutter, occlusion, and variations in appearance and speed, particularly in dynamic combat scenarios. In this study, we propose a Spatio-Temporal Hierarchical Keypoint Aggregation (ST-HKA) framework for martial arts combat action recognition and localization. Unlike conventional methods based on graph convolutional networks or appearance features, ST-HKA adopts a deep learning approach that treats human skeleton keypoints as a 3D point cloud, which improves scalability and robustness to occlusion and appearance variation. In addition, we incorporate weakly supervised spatio-temporal action localization through a context-aware pooling mechanism. The proposed model was evaluated on the Kinetics Human-Action and Taekwondo datasets, demonstrating superior performance in recognizing complex martial arts actions: ST-HKA achieves a Top-1 accuracy of 88.6% and an F1-score of 83.9% on Kinetics, and an accuracy of 88.7% and an F1-score of 84.4% on the Taekwondo dataset. The model also detects temporal action boundaries more precisely, as reflected by its strong performance on action localization tasks. These results highlight the effectiveness of ST-HKA in handling complex martial arts actions with high accuracy and robustness.
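Purely as an illustration of the idea summarized in the abstract, and not the authors' implementation, the following Python sketch shows how per-frame skeleton keypoints could be stacked into an (x, y, t) point cloud and summarized with a score-weighted pooling step. The function names, array shapes, and the specific pooling form are assumptions for this example.

```python
# Illustrative sketch only (not the ST-HKA code): skeleton keypoints treated as a
# spatio-temporal 3D point cloud, with a hypothetical score-weighted pooling.
import numpy as np

def keypoints_to_point_cloud(keypoints: np.ndarray) -> np.ndarray:
    """Flatten per-frame skeleton keypoints into one (x, y, t) point cloud.

    keypoints: array of shape (T, J, 2) -- T frames, J joints, (x, y) coordinates.
    Returns an array of shape (T * J, 3) whose third column is the normalized
    frame index, so temporal order becomes an extra spatial dimension.
    """
    T, J, _ = keypoints.shape
    t = np.repeat(np.linspace(0.0, 1.0, T), J)          # normalized time per point
    xy = keypoints.reshape(T * J, 2)                    # stack all joints, frame-major
    return np.concatenate([xy, t[:, None]], axis=1)     # (T*J, 3) point cloud

def context_aware_pool(features: np.ndarray, scores: np.ndarray) -> np.ndarray:
    """Hypothetical score-weighted pooling over per-point features.

    features: (N, D) per-point descriptors; scores: (N,) relevance weights.
    A softmax-weighted mean keeps action-relevant points dominant, loosely
    mirroring the context-aware pooling idea described in the abstract.
    """
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return (w[:, None] * features).sum(axis=0)          # (D,) pooled descriptor

# Usage example with random data: 32 frames, 17 COCO-style joints.
kp = np.random.rand(32, 17, 2)
cloud = keypoints_to_point_cloud(kp)                    # -> (544, 3)
```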
Pages: 161357-161374
Page count: 18