This article develops a lightweight, decentralized federated learning (FL)-based strategy for electricity theft detection (ETD). Unlike most existing ETD solutions, which typically deploy centralized deep learning models, the proposed method uses well-pruned lightweight networks and operates in a fully decentralized manner while preserving the performance of the ETD model. Specifically, to protect data privacy, a novel sequential decentralized FL (SDFL) framework is designed that eliminates the centralized parameter-aggregation node of traditional FL: each client trains its model locally and exchanges model parameters only with its neighbors. In addition, to ease deployment on edge devices, model pruning is integrated with the sequential transmission characteristic of the SDFL framework: a progressive channel pruning technique gradually reduces the number of model channels during training, compressing the model and simplifying field deployment. Experiments demonstrate that the strategy compresses the model's floating-point operations from 18.32M to 3.60M and its parameter count from 8.61M to 3.47M while protecting user privacy and maintaining good detection performance. Deployment on an edge device (a Raspberry Pi) shows that the strategy reduces model inference time from 329.35 s to 141.50 s, a 57.04% improvement in detection efficiency.
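
To make the SDFL idea concrete, a minimal PyTorch sketch is given below. The ring of four clients, the toy load-profile classifier, and all hyperparameters are illustrative assumptions, not the paper's actual architecture or settings; the point is only the serverless hand-off, in which the model visits each client in turn and is trained on that client's private data before being passed to the next neighbor.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

N_CLIENTS, SEQ_LEN, ROUNDS = 4, 48, 3  # e.g. 48 half-hourly meter readings (assumed)

class Detector(nn.Module):
    """Toy electricity-theft classifier over a daily load profile (illustrative)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(SEQ_LEN, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, x):
        return self.net(x)

def local_train(model, loader, epochs=1):
    """One client's local update on its own private data."""
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.BCEWithLogitsLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x).squeeze(-1), y).backward()
            opt.step()
    return model

def sdfl_round(model, loaders):
    """Sequential decentralized FL: no central aggregator; the model is handed
    from each client to its ring neighbor after a local training pass."""
    for loader in loaders:
        model = local_train(model, loader)
    return model

if __name__ == "__main__":
    torch.manual_seed(0)
    # Synthetic per-client data standing in for each client's private meter readings.
    loaders = [DataLoader(TensorDataset(torch.randn(64, SEQ_LEN),
                                        torch.randint(0, 2, (64,)).float()),
                          batch_size=16, shuffle=True)
               for _ in range(N_CLIENTS)]
    model = Detector()
    for _ in range(ROUNDS):
        model = sdfl_round(model, loaders)
```

Because parameters travel only between neighbors, no single node ever holds all clients' updates, which is what removes the centralized aggregation point of traditional FL.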
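The progressive channel pruning step can likewise be sketched. The L1-norm channel-importance criterion, the linear keep-ratio schedule, and the 40% target ratio below are assumptions made for illustration; the abstract states only that the number of channels is reduced gradually during training as the model circulates through the SDFL ring.

```python
import torch
import torch.nn as nn

def prune_conv_channels(conv: nn.Conv1d, keep_ratio: float):
    """Keep the output channels whose filters have the largest L1 norms
    (illustrative criterion). Returns a narrower Conv1d plus the surviving
    channel indices, so the next layer's input channels can be sliced to match.
    Assumes integer padding."""
    n_keep = max(1, int(conv.out_channels * keep_ratio))
    scores = conv.weight.detach().abs().sum(dim=(1, 2))  # L1 norm per output channel
    keep = torch.argsort(scores, descending=True)[:n_keep].sort().values
    new_conv = nn.Conv1d(conv.in_channels, n_keep,
                         kernel_size=conv.kernel_size[0],
                         stride=conv.stride[0],
                         padding=conv.padding[0],
                         bias=conv.bias is not None)
    new_conv.weight.data = conv.weight.data[keep].clone()
    if conv.bias is not None:
        new_conv.bias.data = conv.bias.data[keep].clone()
    return new_conv, keep

def pruning_schedule(round_idx, total_rounds, target_ratio=0.4):
    """Progressive schedule (assumed linear): shrink the keep-ratio from 1.0
    toward target_ratio over the training rounds."""
    frac = min(1.0, round_idx / max(1, total_rounds - 1))
    return 1.0 - frac * (1.0 - target_ratio)

# Usage sketch: prune a little more each round before the model is handed on.
conv = nn.Conv1d(1, 16, kernel_size=3, padding=1)
for r in range(5):
    conv, kept = prune_conv_channels(conv, pruning_schedule(r, 5))
```

Pruning in small increments at each hand-off, rather than all at once, lets subsequent local training recover accuracy between cuts, which is the rationale for coupling the pruning schedule to the sequential transmission of SDFL.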