Recognition of human activity is an active research area. It draws on the Internet of Things, sensor-based methods, machine learning, and deep learning techniques to support application domains such as home monitoring, robotics, surveillance, and healthcare. However, researchers face challenges such as high time complexity, long model execution times, and limited classification accuracy. This paper introduces a novel approach to overcome these issues using deep learning transformer models: ViT (Vision Transformer), DeiT (Data-efficient image Transformers), and the SwinV2 transformer for image-based datasets (Stanford40 and MPII Human Pose), and the VideoMAE transformer for the video-based UCF101 and HMDB51 datasets. These approaches achieve remarkable accuracy in classifying human activities. Evaluations of ViT, DeiT, and SwinV2 on Stanford40 yield accuracies of 90.8%, 90.7%, and 88%, respectively; on the MPII Human Pose dataset, the corresponding accuracies are 87%, 85.6%, and 87.1%. In addition, applying the VideoMAE transformer to video-based activity recognition achieves remarkable accuracies of 94.15% on the UCF101 dataset and 78.44% on the HMDB51 dataset. These findings emphasize the efficacy of the attention-based transformer models (ViT, DeiT, SwinV2, and VideoMAE) and the novelty of this work, as no prior results have been reported for these datasets with attention-based transformers.
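As a minimal illustration of the kind of fine-tuning setup the abstract describes, the sketch below adapts a pretrained ViT image classifier from the Hugging Face `transformers` library to a 40-way activity label set matching Stanford40; the checkpoint name, hyperparameters, and the random tensors standing in for preprocessed images are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch: fine-tuning a pretrained ViT for 40-class activity recognition
# (e.g., Stanford40). Checkpoint, learning rate, and dummy batch are
# assumptions for illustration only.
import torch
from transformers import ViTForImageClassification

# Load a pretrained ViT and attach a fresh 40-way classification head
# in place of the original ImageNet-1k head.
model = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch16-224",   # assumed checkpoint
    num_labels=40,
    ignore_mismatched_sizes=True,    # new head size differs from ImageNet-1k
)

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Dummy batch standing in for preprocessed images:
# ViT-Base expects 224x224 RGB inputs.
pixel_values = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 40, (8,))

# One training step; passing `labels` makes the model compute
# cross-entropy loss internally.
model.train()
outputs = model(pixel_values=pixel_values, labels=labels)
outputs.loss.backward()
optimizer.step()
print(f"loss: {outputs.loss.item():.4f}")
```

The same pattern extends to DeiT, SwinV2, and VideoMAE via their corresponding `transformers` classes, with VideoMAE taking clips of frames rather than single images.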