With the growth of digital education, using deep learning models to analyze the genre characteristics embedded in music mode signals and to improve students' capacity for music appreciation has attracted considerable attention in music pedagogy research. To improve a deep learning model's ability to capture music mode signals and genre characteristics, this study proposes a music genre classification model based on spectral- and spatial-domain feature attention. The original music mode signal is first filtered, and the resulting Mel spectrogram is then partitioned and fed into the network. The model further strengthens genre feature extraction by modifying the convolutional structure and enhancing spatial-domain attention. Experimental results show that our model outperforms alternative approaches in both accuracy and convergence for music genre classification, improving accuracy by 5.36 to 10.44 percentage points. The model therefore extracts music mode signals precisely, classifies genres accurately, and significantly improves the effectiveness of digital music instruction.
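A minimal sketch of the preprocessing pipeline the abstract describes (filtering the raw signal, computing a Mel spectrogram, and partitioning it before it enters the network) might look as follows. The filter design, sample rate, spectrogram parameters, and patch size below are illustrative assumptions only, since the abstract does not specify them; librosa and scipy are used as stand-in tools rather than the authors' actual implementation.

```python
import numpy as np
import librosa
from scipy.signal import butter, filtfilt


def preprocess(path, sr=22050, n_mels=128, patch_frames=128):
    """Filter a music signal, compute a log-Mel spectrogram, and split it
    into fixed-width patches along the time axis.

    All parameter values (sample rate, filter band, patch width) are
    assumptions for illustration, not the paper's reported settings.
    """
    y, _ = librosa.load(path, sr=sr, mono=True)

    # Band-pass filter as a stand-in for the paper's unspecified filtering step.
    b, a = butter(4, [50 / (sr / 2), 8000 / (sr / 2)], btype="band")
    y = filtfilt(b, a, y)

    # Log-scaled Mel spectrogram of shape (n_mels, n_frames).
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr, n_fft=2048, hop_length=512, n_mels=n_mels
    )
    log_mel = librosa.power_to_db(mel, ref=np.max)

    # Partition the spectrogram along time into equal-width patches,
    # which would then be batched and fed to the classification network.
    n_patches = log_mel.shape[1] // patch_frames
    patches = [
        log_mel[:, i * patch_frames:(i + 1) * patch_frames]
        for i in range(n_patches)
    ]
    return (
        np.stack(patches)
        if patches
        else np.empty((0, n_mels, patch_frames))
    )
```

Partitioning the spectrogram into patches is one plausible reading of "partitioned and fed into the network"; it also has the practical benefit of turning one long recording into several fixed-size training examples.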