Structure-Enriched Topology Learning For Cross-Domain Multi-Person Pose Estimation

Cited by: 0
Authors
Xu, Xixia [1]
Zou, Qi [1]
Lin, Xue [1]
Affiliation
[1] Beijing Jiaotong Univ, Sch Comp & Informat Technol, Beijing Key Lab Traff Data Anal & Min, Beijing 100044, Peoples R China
Funding
Beijing Natural Science Foundation;
Keywords
Pose estimation; Semantics; Training; Topology; Heating systems; Adaptation models; Annotations; Adaptive human-topology learning; domain adaptation; multi-person pose estimation; NETWORK;
DOI
10.1109/TMM.2022.3207578
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Human pose estimation has been widely studied, with most of the focus on supervised learning. In real applications, however, a pretrained pose estimation model usually needs to be adapted to a novel domain that has no labels or only sparse labels. Existing domain adaptation methods do not handle this setting well, because poses have flexible topological structures and require fine-grained local features. Targeting these characteristics of human pose, we propose a novel domain adaptation method for multi-person pose estimation (MPPE) that alleviates the human-level shift. First, the training samples of human poses are clustered into groups according to posture similarity. Within the clustered space, we apply three adaptation modules: Cross-Attentive Feature Alignment (CAFA), Intra-domain Structure Adaptation (ISA), and Adaptive Human-Topology Adaptation (AHTA). CAFA adopts a bidirectional spatial attention mechanism to explore fine-grained local feature correlations between two humans, and thereby adaptively aggregates consistent features for adaptation. ISA works only in semi-supervised domain adaptation (SSDA), where it exploits the semantic relationships of corresponding keypoints to reduce the intra-domain bias. AHTA enriches human topological knowledge to reduce the inter-domain discrepancy. Specifically, the pose structure and the cross-instance topological relations are modeled via graph networks. This flexible topology learning benefits inference on occluded or extreme poses. Extensive experiments are conducted on two popular benchmarks and two additional challenging datasets. Results show that our method, working in unsupervised or semi-supervised mode, is competitive with existing supervised approaches.
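The first step in the abstract, grouping training poses by posture similarity, can be illustrated with a minimal sketch. The abstract does not state which similarity measure or clustering algorithm the authors use, so everything here is an assumption made for illustration: poses are (K, 2) keypoint arrays, similarity is Euclidean distance after removing translation and scale, and the grouping is plain k-means with a deterministic farthest-point initialization. The names `normalize_pose` and `cluster_poses` are hypothetical.

```python
import numpy as np


def normalize_pose(keypoints):
    """Center a (K, 2) keypoint array and scale it to unit Frobenius norm."""
    centered = keypoints - keypoints.mean(axis=0, keepdims=True)
    norm = np.linalg.norm(centered)
    return centered / norm if norm > 0 else centered


def cluster_poses(poses, num_clusters, num_iters=20):
    """Group poses of shape (N, K, 2) with k-means (assumed, not the paper's method)."""
    feats = np.stack([normalize_pose(p) for p in poses]).reshape(len(poses), -1)

    # Farthest-point initialization keeps the sketch deterministic.
    centers = [feats[0]]
    for _ in range(1, num_clusters):
        dists = np.min([np.sum((feats - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(feats[np.argmax(dists)])
    centers = np.stack(centers)

    for _ in range(num_iters):
        # Assign every pose to its nearest center, then recompute the centers.
        sq_dist = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = sq_dist.argmin(axis=1)
        for c in range(num_clusters):
            members = feats[labels == c]
            if len(members) > 0:
                centers[c] = members.mean(axis=0)

    # Final assignment against the converged centers.
    sq_dist = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return sq_dist.argmin(axis=1)


# Toy data: two "standing" and two "lying" stick poses at different positions and scales.
standing = np.array([[0, 0], [0, 1], [0, 2], [0, 3]], dtype=float)
lying = np.array([[0, 0], [1, 0], [2, 0], [3, 0]], dtype=float)
poses = np.stack([standing, standing * 2 + 5, lying, lying * 3 - 1])
labels = cluster_poses(poses, num_clusters=2)
print(labels)
```

In this toy case the two standing poses fall into one group and the two lying poses into the other, regardless of where they sit in the image or how large they are. Adaptation modules such as CAFA would then operate on pairs drawn from the same group.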
Pages: 6272-6284
Number of pages: 13