Automatic defect detection on metal surfaces is a vital task for product inspection in industrial assembly lines and production processes. Owing to the miscellaneous patterns of defects, interclass similarity, intraclass difference, and the scarcity of defect samples, achieving accurate and automatic detection remains a major challenge. Moreover, with the rising demand for production efficiency, real-time detection is increasingly desirable. This article proposes a semantic prior and extremely efficient dilated convolution network, named SPEED, for pixel-wise defect detection on metal surfaces, which aims to address the aforementioned issues. The architecture of SPEED involves the following: 1) a semantic prior (SP) branch, with a shallow layer and a prior mapping module to capture low-level details; and 2) an extremely efficient dilation (EED) branch, with a lightweight bottleneck to obtain high-level context. Furthermore, an aggregation module is designed to fuse both types of feature representation. Additionally, features from different levels of the bottleneck are fused to improve the segmentation performance. Experimental results on three metal surface defect datasets indicate that the proposed method outperforms the state-of-the-art approaches in terms of the mean intersection over union (mIoU), model parameters, FLOPs, and FPS. More specifically, SPEED achieves 92.34% mIoU on NEU-Seg, 88.65% mIoU on Severstal Strip Steel, and 63.91% mIoU on MT Defect.
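Because the headline results are reported as mean intersection over union, a minimal sketch of how that metric is commonly computed for pixel-wise labels may help. The function name `mean_iou`, the flat-list input format, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration; the exact evaluation protocol behind the reported numbers (for example, whether the background class is counted, or whether averaging is per image or per dataset) is not stated here and may differ.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union for two flat sequences of class labels.

    Hypothetical helper for illustration only: `pred` and `target` hold one
    integer class label per pixel, in the same order.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both prediction and ground truth.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labelled c in at least one of the two.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # assumption: ignore classes that never appear
            ious.append(inter / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six pixels, three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

In this toy case the per-class IoU values are 1/3, 2/3, and 1/2, so `score` is their average.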