Detecting prohibited items in X-ray security images is a long-standing research direction in object detection. In practice, items passing through security checks vary widely in appearance and scale, and because X-rays penetrate objects, the inspected items appear occluded and overlapped in the image. As a result, conventional object detection models are prone to false positives and missed detections. To address these issues, this paper proposes MSANet, a feature enhancement and fusion network that integrates multi-scale features with three attention mechanisms. First, attention stacking (AS) modules are added to a ConvNeXt V2 backbone to fuse attention information at the coordinate, channel, and spatial levels, strengthening the network's feature extraction ability. Next, atrous spatial pyramid pooling (ASPP) fuses information from receptive fields of multiple scales, yielding features that carry global context and can guide detection in occluded regions. Finally, to handle the varying scales of prohibited items, a multi-scale feature pyramid fuses local information with the global context extracted by ASPP, improving the network's scale and spatial perception of the objects to be detected. The model was trained and validated on the SIXray-Z and HiXray-Z datasets, where the average accuracy reached 85.66% and 71.55%, respectively, both higher than the original network. These results indicate that MSANet effectively improves the ability of ConvNeXt V2 to detect complex prohibited items.
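To make the multi-scale receptive field idea behind ASPP concrete, the following is a minimal one-dimensional sketch in plain Python. It is not the MSANet implementation: the function names (`atrous_conv1d`, `receptive_field`, `aspp_1d`), the example kernel, and the dilation rates are illustrative assumptions, and the real module operates on 2-D feature maps with learned weights. The sketch only shows how parallel atrous (dilated) convolutions with different rates see progressively wider context, and how their responses can be stacked per position.

```python
def atrous_conv1d(signal, kernel, rate):
    """1-D atrous (dilated) convolution with zero padding; output has the input length."""
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Kernel taps are spaced `rate` samples apart.
            idx = i + (j - center) * rate
            if 0 <= idx < len(signal):  # samples outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, rate):
    """Span of input samples covered by one dilated kernel."""
    return (kernel_size - 1) * rate + 1


def aspp_1d(signal, kernel, rates):
    """Run one atrous branch per rate and stack the responses per position.

    Stacking stands in for channel-wise concatenation; how the branches are
    actually fused in MSANet is not specified here and is an assumption.
    """
    branches = [atrous_conv1d(signal, kernel, r) for r in rates]
    return [list(values) for values in zip(*branches)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    kernel = [1, 1, 1]          # hypothetical fixed weights for illustration
    rates = (1, 2)              # hypothetical dilation rates
    for r in rates:
        print(r, receptive_field(len(kernel), r), atrous_conv1d(impulse, kernel, r))
    print(aspp_1d(impulse, kernel, rates))
```

With the same three-tap kernel, a larger rate spreads the response to the impulse over more distant positions, which is the mechanism that lets the fused features carry wider context for occluded regions.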