MixPose: 3D Human Pose Estimation with Mixed Encoder

Cited by: 0

Authors:

Cheng, Jisheng ^{[1,2,3]}

Cheng, Qin ^{[1,3]}

Yang, Mengjie ^{[4]}

Liu, Zhen ^{[1,3]}

Zhang, Qieshi ^{[1,3]}

Cheng, Jun ^{[1,3]}

Affiliations:

[1] Chinese Acad Sci, Guangdong Prov Key Lab Robot & Intelligent Syst, Shenzhen Inst Adv Technol, Shenzhen, Peoples R China

[2] Univ Chinese Acad Beijing, Beijing, Peoples R China

[3] Chinese Univ Hong Kong, Hong Kong, Peoples R China

[4] Shine Technol Co Ltd, Beijing, Peoples R China

Source:

PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT VIII | 2024 / Vol. 14432

Funding:

National Natural Science Foundation of China;

Keywords:

3D human pose estimation; Transformer; Mixed encoder;

DOI:

10.1007/978-981-99-8543-2_29

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The fusion of spatio-temporal information is crucial for 3D human pose estimation in video. Existing methods usually extract temporal information from spatially encoded poses, which may limit the interaction between spatial and temporal information. To address this issue, we propose MixPose, a novel network for 3D human pose estimation in videos with a mixed encoder. We introduce independent mixed encoders to fuse spatio-temporal information in the sequence, and augment the perception of each point with global information using an attention module. We evaluate MixPose on two public datasets, Human3.6M and HumanEva; experimental results show that MixPose outperforms other state-of-the-art methods in specific scenarios.


Pages: 353-364

Page count: 12

