MixPose: 3D Human Pose Estimation with Mixed Encoder

Citations: 0
Authors
Cheng, Jisheng [1 ,2 ,3 ]
Cheng, Qin [1 ,3 ]
Yang, Mengjie [4 ]
Liu, Zhen [1 ,3 ]
Zhang, Qieshi [1 ,3 ]
Cheng, Jun [1 ,3 ]
Affiliations
[1] Chinese Acad Sci, Guangdong Prov Key Lab Robot & Intelligent Syst, Shenzhen Inst Adv Technol, Shenzhen, Peoples R China
[2] Univ Chinese Acad Beijing, Beijing, Peoples R China
[3] Chinese Univ Hong Kong, Hong Kong, Peoples R China
[4] Shine Technol Co Ltd, Beijing, Peoples R China
Source
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT VIII | 2024 / Vol. 14432
Funding
National Natural Science Foundation of China;
Keywords
3D human pose estimation; Transformer; Mixed encoder;
DOI
10.1007/978-981-99-8543-2_29
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The fusion of spatio-temporal information is crucial for 3D human pose estimation in video. Existing methods usually extract temporal information from spatially encoded poses, which may limit the interaction between spatial and temporal information. To address this issue, we propose MixPose, a novel network for 3D human pose estimation in videos that uses a mixed encoder. We introduce independent mixed encoders to fuse spatio-temporal information in the sequence, and augment the perception of each point with global information using an attention module. We evaluate MixPose on two public datasets, Human3.6M and HumanEva. Experimental results show that MixPose outperforms other state-of-the-art methods in specific scenarios.
Pages: 353-364
Number of pages: 12