MixPose: 3D Human Pose Estimation with Mixed Encoder

Cited by: 0

Authors:

Cheng, Jisheng ^{[1,2,3]}

Cheng, Qin ^{[1,3]}

Yang, Mengjie ^{[4]}

Liu, Zhen ^{[1,3]}

Zhang, Qieshi ^{[1,3]}

Cheng, Jun ^{[1,3]}

Affiliations:

[1] Chinese Acad Sci, Guangdong Prov Key Lab Robot & Intelligent Syst, Shenzhen Inst Adv Technol, Shenzhen, Peoples R China

[2] Univ Chinese Acad Beijing, Beijing, Peoples R China

[3] Chinese Univ Hong Kong, Hong Kong, Peoples R China

[4] Shine Technol Co Ltd, Beijing, Peoples R China

Source:

PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT VIII | 2024 / Vol. 14432

Funding:

National Natural Science Foundation of China;

Keywords:

3D human pose estimation; Transformer; Mixed encoder;

DOI:

10.1007/978-981-99-8543-2_29

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The fusion of spatio-temporal information is crucial for 3D human pose estimation in video. Existing methods usually extract temporal information from spatially encoded poses, which may limit the interaction between spatial and temporal information. To address this issue, we propose MixPose, a novel network for 3D human pose estimation in videos with a mixed encoder. We introduce independent mixed encoders to fuse spatio-temporal information in the sequence, and augment the perception of each point with global information using an attention module. We evaluate MixPose on two public datasets, Human3.6M and HumanEva. Experimental results show that MixPose outperforms other state-of-the-art methods in specific scenarios.


Pages: 353-364

Page count: 12

