The photovoltaic (PV) industry is a key area of development in meeting global demand for renewable energy, and efficient fault detection in solar cells, its core component, is vital. Traditional manual fault detection is inefficient and costly, while existing deep learning models fall short in both accuracy and speed. To address these problems, this study proposes ESD-YOLOv8, a model optimised for infrared solar cell images captured by UAVs that efficiently identifies micro-defect features. Small-defect detection is enhanced by modifying the YOLOv8 architecture: the P5 layer is removed, the small-target-sensitive P2 layer is introduced, and the EMA attention mechanism and the C2f_EMA module are integrated. In addition, a feature fusion layer guided by the CloAttention mechanism is designed to focus the model's attention on small-target defect features in the P2 layer, thereby improving the accuracy of defect localisation. The Unified IoU (UIoU) metric is employed to optimise the loss function and improve the accuracy of fault prediction. Performance tests show that the F1 score of ESD-YOLOv8 at mAP@0.5 reaches 91.8% and that mAP@0.5:0.95 reaches 58.0%. The system also performs well in terms of latency and computational resource requirements, meeting the demands of actual production for efficient real-time fault detection. This study not only reduces the burden of manual inspection but also provides an efficient, high-precision solution for intelligent PV system fault diagnosis.
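For readers unfamiliar with the reported metrics, the following minimal sketch illustrates the plain intersection-over-union (IoU) measure that underlies the mAP@0.5 and mAP@0.5:0.95 thresholds. It is not the Unified IoU loss used in this study, and the function names, the corner-based box format, and the example boxes are assumptions chosen for illustration only.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Plain IoU of two axis-aligned boxes (not the UIoU variant)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def map_thresholds() -> List[float]:
    """IoU thresholds averaged in mAP@0.5:0.95 (0.50 to 0.95, step 0.05)."""
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]


def thresholds_met(pred: Box, truth: Box) -> List[float]:
    """Thresholds at which a prediction would count as a correct match."""
    score = iou(pred, truth)
    return [t for t in map_thresholds() if score >= t]


# Hypothetical example: the same 2-pixel localisation error on a small
# and on a large defect box.
small_truth, small_pred = (0, 0, 8, 8), (2, 0, 10, 8)
large_truth, large_pred = (0, 0, 80, 80), (2, 0, 82, 80)

print(iou(small_pred, small_truth), thresholds_met(small_pred, small_truth))
print(iou(large_pred, large_truth), thresholds_met(large_pred, large_truth))
```

In this hypothetical case the 2-pixel offset leaves the small box matched only at the loosest thresholds, while the large box still passes all ten. This is one way to see why small defects weigh heavily on mAP@0.5:0.95 and why localisation accuracy on small targets matters.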