Deep learning (DL) models have achieved great success in fault diagnosis (FD). However, most existing DL models are memory-intensive and computationally expensive, so DL-based FD methods suffer from three drawbacks when deployed on resource-limited edge devices in real industrial settings: large model sizes, high computational demands, and long inference times. To address these drawbacks, a sparse convolutional neural network (CNN) framework is proposed for efficient FD on edge devices. First, a standard CNN is built and trained to convergence on the target FD task. Then, a Taylor expansion-based criterion is designed to evaluate the importance of each convolutional filter, so that unimportant and redundant filters of the trained CNN can be iteratively pruned. The remaining filters constitute a sparse CNN (SCNN) that is smaller and faster at inference. Finally, the SCNN is applied to diagnose faults efficiently. The effectiveness of the proposed framework is demonstrated on two public disassembling part datasets and a real industrial oxygen compressor dataset. Experimental results indicate that the proposed SCNN framework substantially reduces memory footprint and computation on edge devices compared with standard CNNs, while achieving accuracy superior to state-of-the-art methods.
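The pipeline above ranks filters by a first-order Taylor criterion and iteratively removes the lowest-ranked ones. A minimal NumPy sketch of that idea is given below; the function names, the per-layer score normalization, and the fixed prune ratio are illustrative assumptions, not the paper's exact implementation. The score of a filter is taken as the absolute mean of the product of its activations and the loss gradients with respect to them, which approximates the change in loss if the filter were removed.

```python
import numpy as np

def taylor_filter_importance(activations, gradients):
    """Hypothetical sketch of a first-order Taylor pruning criterion.

    activations, gradients: arrays of shape (batch, filters, H, W)
    holding a conv layer's feature maps and the gradients of the loss
    with respect to those feature maps.

    Filter m's score ~ |mean over batch and space of a_m * g_m|,
    a first-order estimate of the loss change if filter m is removed.
    """
    prod = activations * gradients
    scores = np.abs(prod.mean(axis=(0, 2, 3)))          # one score per filter
    # Normalize within the layer so scores are comparable across layers
    # (an assumed design choice, common in Taylor-pruning variants).
    return scores / (np.linalg.norm(scores) + 1e-8)

def filters_to_keep(scores, prune_ratio=0.25):
    """Indices of filters that survive one pruning iteration:
    drop the lowest-scoring fraction given by prune_ratio."""
    n_prune = int(len(scores) * prune_ratio)
    order = np.argsort(scores)          # ascending: least important first
    return np.sort(order[n_prune:])     # keep the rest, in original order
```

In an iterative scheme, one would alternate these two steps with brief fine-tuning: score filters on a batch, prune a small fraction, retrain, and repeat until the accuracy/size trade-off target is met.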