Fruit defect classification and visual quality inspection are crucial for automated harvesting in agriculture. Existing detection models suffer from large parameter counts, low target recognition accuracy, and interference from background noise. To address these issues, we propose AOD-Net, a novel fruit detection algorithm that integrates a Polarized Self-Attention (PSA) mechanism and a new lightweight structure, Cross-Stage Partial Convolution (CSPC). With the PSA module added at the end of the backbone network, AOD-Net establishes mutual dependencies between feature channels, which suppresses background noise and improves the network's ability to extract and recognize subtle target features, thereby enhancing target localization accuracy. The CSPC structure, inspired by Dual-Conv and Partial-Conv, replaces certain convolutional layers, significantly reducing model parameters and accelerating detection while maintaining accuracy, so that real-time requirements are met. A Receptive-Field Attention Convolution module is incorporated into the neck network to enhance feature learning, improve feature extraction accuracy, and address the parameter-sharing problem, thus improving model generalization. Additionally, the DySample upsampling operator replaces traditional nearest-neighbor interpolation, reducing computational cost while improving feature fusion across different fruit types, which enhances detection accuracy and robustness. Experiments on the publicly available FruitNet dataset show that AOD-Net achieves a mAP of 93.55%, improving Precision, Recall, and mAP by 1.30%, 1.96%, and 3.95%, respectively, over the standard YOLOv5s. The model's memory usage decreases by 8.97%, and the computational cost falls from 16.0 GFLOPs to 11.3 GFLOPs, verifying the effectiveness of the proposed algorithm. AOD-Net thus balances speed and accuracy well, making it an efficient and practical fruit detection method.
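
To put the reported efficiency figures in context, the short sketch below computes the relative drop in computational cost from the two GFLOPs values given above and illustrates why convolving only a subset of channels, as Partial-Conv does, shrinks a layer's weight count. It is not the CSPC implementation: the helper names, the 256-channel 3x3 layer, and the channel ratio of 4 are illustrative assumptions that do not come from the paper.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the percentage reduction going from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def partial_conv_params(channels: int, k: int, ratio: int = 4) -> int:
    """Weight count when only channels // ratio channels are convolved.

    The remaining channels pass through untouched, which is the basic
    idea behind Partial-Conv. The default ratio of 4 is an assumption
    made for illustration, not a value taken from the paper.
    """
    convolved = channels // ratio
    return conv_params(convolved, convolved, k)


# GFLOPs figures reported in the abstract (YOLOv5s vs. AOD-Net).
gflops_drop = relative_reduction(16.0, 11.3)
print(f"GFLOPs reduction: {gflops_drop:.1f}%")

# Hypothetical 256-channel 3x3 layer, chosen only for illustration.
full = conv_params(256, 256, 3)
partial = partial_conv_params(256, 3)
print(f"standard: {full} weights, partial: {partial} weights")
```

Under these assumptions the reported change from 16.0 to 11.3 GFLOPs corresponds to a reduction of roughly 29.4%, and the hypothetical layer drops from 589,824 to 36,864 weights. The actual savings in AOD-Net depend on which layers CSPC replaces and how it combines the Dual-Conv and Partial-Conv ideas, which the abstract does not specify.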