With the rapid development of artificial intelligence technology, methods for classroom interaction behavior analysis based on computer vision and deep learning have gradually become an important direction in educational research. Classroom interaction behavior is a crucial indicator of teaching effectiveness and student learning progress. Traditional classroom observation methods fail to meet the demands of real-time monitoring, accuracy, and comprehensiveness. Image-based interaction behavior analysis can improve the precision and efficiency of classroom assessments through automation. Existing research mainly focuses on the recognition and analysis of interaction behaviors, but several challenges remain: insufficient detection accuracy, a lack of dynamic spatiotemporal information integration in behavior recognition, and poor real-time performance of the algorithms. Therefore, improving the accuracy of target detection and behavior recognition, especially in complex classroom environments, remains a critical research challenge. This paper proposes an improved YOLOv5-based classroom interaction object detection algorithm to address accuracy and real-time issues in recognizing interaction objects in complex classroom settings. Additionally, the paper presents an interaction behavior recognition method based on dynamic spatiotemporal information fusion, which enhances the accuracy and robustness of behavior recognition by integrating spatiotemporal features. The improved algorithm framework enhances the precision and efficiency of interaction behavior analysis, providing technical support for intelligent evaluation of teaching processes and personalized education.