Omni-TransPose: Fusion of OmniPose and Transformer Architecture for Improving Action Detection
Cited by: 0
Authors:
Phu, Khac-Anh [1,2]
Hoang, Van-Dung [3]
Le, Van-Tuong-Lan [4]
Tran, Quang-Khai [3]
Affiliations:
[1] Hue Univ, Univ Sci, Fac Informat Technol, Hue City 530000, Vietnam
[2] Cao Thang Tech Coll, Fac Informat Technol, Ho Chi Minh City 720000, Vietnam
[3] HCMC Univ Technol & Educ, Fac Informat Technol, Ho Chi Minh City 720000, Vietnam
[4] Hue Univ, Dept Acad & Students Affairs, Hue City 530000, Vietnam
Source:
RECENT CHALLENGES IN INTELLIGENT INFORMATION AND DATABASE SYSTEMS, PT II, ACIIDS 2024 | 2024 / Vol. 2145
Keywords:
Computer vision;
Deep learning;
Skeleton data;
DOI:
10.1007/978-981-97-5934-7_6
Chinese Library Classification (CLC) number:
TP18 [Artificial Intelligence Theory];
Subject classification codes:
081104 ;
0812 ;
0835 ;
1405 ;
Abstract:
The field of computer vision research has been experiencing rapid and remarkable development in recent years, aiming to analyze image and video data through increasingly sophisticated machine learning models. In this research domain, capturing and extracting relevant features plays a crucial role in approaching the detailed content and semantics of image and video data. Among these, skeleton data, with the ability to represent the position and movements of human body parts, along with its simplicity and independence from external factors, has proven highly effective in solving human action recognition problems. Consequently, many researchers have shown interest and proposed various skeleton data extraction models following different approaches. In this study, we introduce the Omni-TransPose model for skeleton data extraction, constructed by combining the OmniPose model with the Transformer architecture. We conducted experiments on the MPII dataset, using the Percentage of Correct Key Points (PCK) metric to evaluate the effectiveness of the new model. The experimental results were compared with the original OmniPose model, demonstrating a significant improvement in skeleton extraction and recognition, thereby enhancing the capability of human action recognition. This work promises to provide an efficient and powerful method for human action recognition, with broad potential applications in practical scenarios.