Human pose estimation is fundamental to many computer vision tasks and has made significant progress in recent years. However, the problem of unbalanced performance among joints has received too little attention. Building on SimpleBaseline (Xiao et al., Proceedings of the European Conference on Computer Vision, 2018), we propose a weighted summation method for local keypoints and a selective receptive field (SRF) unit, and we use feature fusion to tackle this problem. First, the weighted summation method for local keypoints is designed to make the network explicitly attend to keypoints with large loss values; it calculates a weight for each joint according to that joint's loss value. Second, the SRF unit is proposed to adaptively select the receptive field size for keypoints: multiple branches with different kernel sizes are compared using softmax attention, and the Select operator then chooses one of these branches to yield an effective receptive field. Third, the features coming from the encoder are merged in the decoder by concatenation to handle occluded joints, which enhances communication between spatial information and semantic information. The experimental results show that, as a model-agnostic approach, our method improves SimpleBaseline-50-256×192 by 4.3 AP on the COCO validation set. Extensive experiments demonstrate that the proposed approach is superior to several state-of-the-art methods in terms of accuracy and robustness.
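The loss-based keypoint weighting described above can be illustrated with a minimal sketch. The abstract does not give the exact weighting formula, so the version below rests on an assumption: each joint's weight is its share of the total loss, rescaled so that the weights average to 1. The function names (`per_joint_mse`, `loss_based_weights`, `weighted_keypoint_loss`) are hypothetical and are not taken from the paper.

```python
def per_joint_mse(pred, target):
    """Mean squared error per joint.

    pred and target are lists with one flattened heatmap (list of floats)
    per joint.
    """
    losses = []
    for p, t in zip(pred, target):
        losses.append(sum((a - b) ** 2 for a, b in zip(p, t)) / len(p))
    return losses


def loss_based_weights(joint_losses, eps=1e-12):
    """Assumed rule: weight proportional to each joint's loss, mean weight 1."""
    total = sum(joint_losses)
    n = len(joint_losses)
    if total < eps:
        # All joints are already fit; fall back to uniform weights.
        return [1.0] * n
    return [n * loss / total for loss in joint_losses]


def weighted_keypoint_loss(pred, target):
    """Weighted mean of per-joint losses; harder joints count more."""
    losses = per_joint_mse(pred, target)
    # In an autograd framework the weights would normally be treated as
    # constants (no gradient flows through them).
    weights = loss_based_weights(losses)
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


# Toy example: three joints, four heatmap pixels each.
target = [[1.0, 0.0, 0.0, 0.0]] * 3
pred = [
    [1.0, 0.0, 0.0, 0.0],  # perfect prediction
    [0.5, 0.0, 0.0, 0.0],  # moderate error
    [0.0, 0.0, 0.0, 0.0],  # large error
]
joint_losses = per_joint_mse(pred, target)
weights = loss_based_weights(joint_losses)
total_loss = weighted_keypoint_loss(pred, target)
```

Under this assumed rule, the joint with the largest error receives the largest weight, so the weighted loss exceeds the plain mean of the per-joint losses whenever the errors are unequal. Other monotone choices, such as a softmax over the per-joint losses, would serve the same purpose.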