Facial expression recognition under classroom scenes can help the teacher to understand students' classroom learning status and improve teaching effectiveness. Aiming at the problem of low expression recognition accuracy in classroom scenarios, a novel multi-scale facial expression recognition algorithm based on improved Res2Net is proposed. Firstly, a bi-directional residual BiRes2Net module is proposed to achieve bi-directional multi-scale expression feature extraction at the fine-grained level, while a short-directed connection path is introduced to make the network have the self-closing capability and avoid extracting redundant information of expressions; Then the Fine-Grained Coordinate Attention (FGCA) mechanism is embedded to extract expression spatial location features and channel features at a fine-grained level by making full use of the prior knowledge of facial expressions; Finally, a multi-classification Focalloss loss function is used to alleviate the imbalance of expression data, and different weights are assigned to expression samples with different recognition difficulty so that the network is biased towards difficult sample feature extraction. The experimental results show that the recognition accuracy of the proposed method is 79.47%, 94.06%, and 96.67% in RAF-DB, JAFFE, and CK+ datasets respectively, and up to 72.71% in real classroom scenes, which are better than other comparative algorithms significantly.