Traditional dynamometer card sensors are costly and complex, making them unsuitable for real-time diagnosis of sucker rod pumping (SRP) wells. Recent SRP diagnosis models based on motor power curves offer an alternative, but irregular power curves and limited labeled data remain challenges. To address these challenges, we propose a deep transfer model enhanced by data augmentation with a Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP). First, a physics-driven model reconstructs polished rod torque from motor power, which emphasizes fault features. Second, WGAN-GP generates torque-displacement samples of faulty SRP wells to expand the training data. Finally, a deep transfer SRP diagnosis framework with all parameters fine-tuned is established; trained on the augmented dataset, it improves the automatic learning of high-level fault features and enhances diagnostic accuracy. Experiments confirm the model's superior performance and generalization.