MixPose: 3D Human Pose Estimation with Mixed Encoder

Cited by: 0
Authors
Cheng, Jisheng [1, 2, 3]
Cheng, Qin [1, 3]
Yang, Mengjie [4]
Liu, Zhen [1, 3]
Zhang, Qieshi [1, 3]
Cheng, Jun [1, 3]
Affiliations
[1] Chinese Acad Sci, Guangdong Prov Key Lab Robot & Intelligent Syst, Shenzhen Inst Adv Technol, Shenzhen, Peoples R China
[2] Univ Chinese Acad Beijing, Beijing, Peoples R China
[3] Chinese Univ Hong Kong, Hong Kong, Peoples R China
[4] Shine Technol Co Ltd, Beijing, Peoples R China
Source
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT VIII | 2024 / Vol. 14432
Funding
National Natural Science Foundation of China;
Keywords
3D human pose estimation; Transformer; Mixed encoder;
DOI
10.1007/978-981-99-8543-2_29
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
The fusion of spatio-temporal information is crucial for 3D human pose estimation in video. Existing methods usually extract temporal information from spatially encoded poses, which may limit the interaction between spatial and temporal information. To address this issue, we propose MixPose, a novel network for 3D human pose estimation in videos that uses a mixed encoder. We introduce independent mixed encoders to fuse spatio-temporal information in the sequence, and augment the perception of each point with global information using an attention module. We evaluate MixPose on two public datasets, Human3.6M and HumanEva. Experimental results show that MixPose outperforms other state-of-the-art methods in specific scenarios.
Pages: 353-364
Number of pages: 12