With only a few fault samples available, existing fault diagnosis methods tend to overfit, which prevents accurate diagnosis. This work therefore presents a few-shot fault diagnosis model for wind turbine (WT) generators that uses a Convolutional Normalization Transformer Encoder (CNTE) within the Model-Agnostic Meta-Learning (MAML) framework. Specifically, MAML acts as the meta-learner and generates numerous tasks through random sampling, while the CNTE model serves as the base learner and integrates a one-dimensional convolutional neural network, a multi-head self-attention mechanism, and batch normalization layers to capture both global and local features of the data. Training proceeds at the task level, iteratively adjusting the initialization parameters of the base learner so that it can adapt to diverse data types. MAML's first (inner-loop) gradient update fine-tunes the base learner on each task, while its second (outer-loop) gradient update learns universal feature representations across tasks, which enables rapid learning. As a result, the model adapts quickly and generalizes well to new tasks, even with few samples. To validate the method for WT generator fault diagnosis, experiments were conducted on data covering five types of generator faults simulated with the FAST WT simulation platform and on data covering four types of generator faults collected from actual wind farms. The proposed method was also compared with seven widely used methods. The experimental results show that the proposed method diagnoses faults accurately with small sample sizes and outperforms the comparison methods.
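The two-level MAML update described above can be illustrated with a minimal sketch. It is not the paper's implementation: a two-parameter linear regressor stands in for the CNTE base learner so that the second-order term can be written analytically, and the function names (`sample_task`, `adapt`, `meta_grad`, `meta_train`), the synthetic tasks, and the step sizes `alpha` and `beta` are all assumptions chosen for illustration.

```python
import numpy as np

def loss(theta, X, y):
    """Mean squared error of the linear model X @ theta."""
    r = X @ theta - y
    return float(np.mean(r ** 2))

def grad(theta, X, y):
    """Gradient of the mean squared error with respect to theta."""
    return 2.0 * X.T @ (X @ theta - y) / len(y)

def hessian(X):
    """Hessian of the mean squared error (constant for a linear model)."""
    return 2.0 * X.T @ X / X.shape[0]

def sample_task(rng, k_shot=5):
    """Randomly sample one task: a support set and a query set (hypothetical data)."""
    slope, intercept = rng.uniform(-2, 2), rng.uniform(-1, 1)
    x = rng.uniform(-1, 1, size=2 * k_shot)
    X = np.stack([x, np.ones_like(x)], axis=1)
    y = slope * x + intercept
    return X[:k_shot], y[:k_shot], X[k_shot:], y[k_shot:]

def adapt(theta, Xs, ys, alpha):
    """Inner loop: one gradient step on the support set (task-level fine-tuning)."""
    return theta - alpha * grad(theta, Xs, ys)

def meta_grad(theta, task, alpha):
    """Outer-loop gradient: query loss after adaptation, differentiated
    with respect to the shared initialization theta."""
    Xs, ys, Xq, yq = task
    theta_adapted = adapt(theta, Xs, ys, alpha)
    # Jacobian of the adapted parameters w.r.t. theta; this carries the
    # second-order (Hessian) term of MAML.
    jac = np.eye(len(theta)) - alpha * hessian(Xs)
    return jac.T @ grad(theta_adapted, Xq, yq)

def meta_train(steps=200, tasks_per_step=4, alpha=0.1, beta=0.05, seed=0):
    """Update the initialization using the average meta-gradient over sampled tasks."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)
    for _ in range(steps):
        tasks = [sample_task(rng) for _ in range(tasks_per_step)]
        g = np.mean([meta_grad(theta, t, alpha) for t in tasks], axis=0)
        theta = theta - beta * g
    return theta
```

Calling `meta_train()` returns an initialization from which `adapt` can fine-tune on the support set of a newly sampled task in a single step. In a setting like the one described above, the linear model would be replaced by a neural base learner and the analytic Jacobian by automatic differentiation.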