The widespread adoption of deep learning in object detection has increased the demand for higher detection accuracy and more precise localization. The proposed YOLO-CA detection algorithm incorporates two techniques: CoordConv and RoIAlign. First, CoordConv feeds pixel coordinate information into the convolutional neural network, allowing the network to model spatial relationships between pixels more effectively and improving performance on tasks that depend on spatial structure. Second, RoIAlign is used to address the pixel misalignment that RoIPool can introduce, improving the spatial accuracy of the regions of interest. Experiments on the VOC dataset show that the improved model, which combines CoordConv and RoIAlign, raises the mAP score by 0.025, indicating improved object detection performance. These results suggest that the two techniques improve both detection accuracy and spatial precision, and they offer a practical strategy for deep learning based object detection.
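To make the two ideas concrete, the following is a minimal, framework-free sketch and not the YOLO-CA implementation. It assumes feature maps stored as nested lists in [C][H][W] order, coordinates scaled to [-1, 1], RoI boxes given as fractional feature-map positions (y1, x1, y2, x2) with integer positions coinciding with cell indices, and a fixed 2x2 sampling grid per output bin. All function names (`coord_channels`, `add_coords`, `bilinear_sample`, `roi_align_channel`) are hypothetical and chosen only for illustration.

```python
import math


def coord_channels(height, width):
    """Return (x_map, y_map), each height x width, with values scaled to [-1, 1]."""
    def scale(i, n):
        # A single row or column has no extent, so map it to 0 (assumption).
        return 0.0 if n == 1 else 2.0 * i / (n - 1) - 1.0

    x_map = [[scale(c, width) for c in range(width)] for _ in range(height)]
    y_map = [[scale(r, height) for _ in range(width)] for r in range(height)]
    return x_map, y_map


def add_coords(feature):
    """CoordConv idea: append x and y coordinate channels to a [C][H][W] map.

    An ordinary convolution applied afterwards can then read pixel positions.
    """
    height, width = len(feature[0]), len(feature[0][0])
    x_map, y_map = coord_channels(height, width)
    return feature + [x_map, y_map]


def bilinear_sample(channel, y, x):
    """Interpolate one [H][W] channel at a fractional location (y, x)."""
    height, width = len(channel), len(channel[0])
    # Clamp to the valid range so that all four neighbours exist.
    y = min(max(y, 0.0), height - 1.0)
    x = min(max(x, 0.0), width - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, height - 1), min(x0 + 1, width - 1)
    wy, wx = y - y0, x - x0
    top = channel[y0][x0] * (1 - wx) + channel[y0][x1] * wx
    bottom = channel[y1][x0] * (1 - wx) + channel[y1][x1] * wx
    return top * (1 - wy) + bottom * wy


def roi_align_channel(channel, box, out_size=2, samples=2):
    """RoIAlign idea: average bilinear samples inside each output bin.

    The box (y1, x1, y2, x2) is never rounded to integer cells, which is
    the step where RoIPool loses alignment.
    """
    y1, x1, y2, x2 = box
    bin_h = (y2 - y1) / out_size
    bin_w = (x2 - x1) / out_size
    output = []
    for by in range(out_size):
        row = []
        for bx in range(out_size):
            total = 0.0
            for sy in range(samples):
                for sx in range(samples):
                    # Regularly spaced sample points inside the bin.
                    py = y1 + (by + (sy + 0.5) / samples) * bin_h
                    px = x1 + (bx + (sx + 0.5) / samples) * bin_w
                    total += bilinear_sample(channel, py, px)
            row.append(total / (samples * samples))
        output.append(row)
    return output
```

In this sketch, `add_coords` increases the channel count by two, so a convolution that follows it would need two extra input channels. `roi_align_channel` processes a single channel; applying it to every channel of a feature map yields a fixed-size RoI feature regardless of the box size.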