To address the low precision, slow speed, and difficulty of detecting small pear fruit targets in real environments, this paper designs a lightweight Transformer-based pear fruit detection model for natural environments, built on the RT-DETR model. A Xinli No. 7 fruit dataset covering different environmental conditions is also established. First, the backbone of the original model is replaced with the lightweight FasterNet network. Second, HiLo, an improved, efficient attention mechanism that extracts high- and low-frequency information, is introduced to lighten the model and improve feature extraction for Xinli No. 7 in complex environments. Third, the CCFM module is reconstructed using the Slim-Neck method, and the loss function of the original model is replaced with the Shape-NWD loss for small target detection to strengthen the feature extraction capability of the network. Comparison tests against YOLOv5m, YOLOv7, YOLOv8m, YOLOv10m, and Deformable-DETR show that the improved RT-DETR model achieves a good balance between model size and recognition accuracy and surpasses the detection accuracy of the recent YOLOv10 algorithm, enabling rapid detection of Xinli No. 7 fruit. The precision, recall, and average precision of the improved model reach 93.7%, 91.9%, and 98%, respectively; compared with the original model, the parameter count, computational cost, and weight file size are reduced by 48.47%, 56.2%, and 48.31%, respectively. The model provides technical support for Xinli No. 7 fruit detection and for model deployment in complex environments.
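
To illustrate why an NWD-style loss suits small targets, the following is a minimal sketch of the plain Normalized Wasserstein Distance (NWD) between two boxes. It is not the Shape-NWD variant used in this paper, whose shape weighting is not specified here. The function names `nwd` and `nwd_loss`, the `(cx, cy, w, h)` box format, and the constant `c=12.8` are assumptions made for illustration only.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Plain NWD between two boxes given as (cx, cy, w, h).

    Each box is modeled as a 2D Gaussian with mean (cx, cy) and
    covariance diag(w^2 / 4, h^2 / 4). For such Gaussians the squared
    2-Wasserstein distance reduces to a squared Euclidean distance
    between the vectors (cx, cy, w / 2, h / 2).

    `c` is a dataset-dependent normalizing constant; 12.8 is only an
    assumed placeholder, not a value taken from this paper.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    w2_squared = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    # Map the distance into (0, 1]; identical boxes give exactly 1.
    return math.exp(-math.sqrt(w2_squared) / c)


def nwd_loss(pred_box, target_box, c=12.8):
    """Loss form of the similarity above: 0 for a perfect match."""
    return 1.0 - nwd(pred_box, target_box, c)
```

For two 4 x 4 pixel boxes whose centers are 5 pixels apart, the boxes do not overlap and IoU is zero, yet `nwd` still returns a nonzero similarity that grows smoothly as the boxes approach each other. This smooth behavior is the usual motivation for using such a measure on small targets.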