Toward Automated Classroom Observation: Multimodal Machine Learning to Estimate CLASS Positive Climate and Negative Climate

Cited by: 18
Authors
Ramakrishnan, Anand [1 ]
Zylich, Brian [1 ]
Ottmar, Erin [1 ]
LoCasale-Crouch, Jennifer [2 ]
Whitehill, Jacob [1 ]
Affiliations
[1] Worcester Polytech Inst, Worcester, MA 01609 USA
[2] Univ Virginia, Charlottesville, VA 22807 USA
Funding
US National Science Foundation;
Keywords
Videos; Meteorology; Encoding; Machine learning; Activity recognition; Computer vision; Computer architecture; Automatic classroom observation; classroom assessment scoring system; facial expression recognition; auditory analysis; CHILD-CARE; ENGAGEMENT; QUALITY;
DOI
10.1109/TAFFC.2021.3059209
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this article we present a multi-modal machine learning-based system, which we call ACORN, to analyze videos of school classrooms for the Positive Climate (PC) and Negative Climate (NC) dimensions of the CLASS [1] observation protocol that is widely used in educational research. ACORN uses convolutional neural networks to analyze spectral audio features, the faces of teachers and students, and the pixels of each image frame, and then integrates this information over time using Temporal Convolutional Networks. The audiovisual ACORN's PC and NC predictions have Pearson correlations of 0.55 and 0.63 with ground-truth scores provided by expert CLASS coders on the UVA Toddler dataset (cross-validation on n = 300 15-min video segments), and a purely auditory ACORN predicts PC and NC with correlations of 0.36 and 0.41 on the MET dataset (test set of n = 2000 video segments). These numbers are similar to the inter-coder reliability of human coders. Finally, using Graph Convolutional Networks we make early strides (AUC = 0.70) toward predicting the specific moments (45-90 sec clips) when the PC is particularly weak/strong. Our findings inform the design of automatic classroom observation systems and also more general video activity recognition and summarization systems.
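The evaluation metric reported above is the Pearson correlation between ACORN's per-segment predictions and expert CLASS coders' ground-truth scores. A minimal sketch of that computation (not the authors' code; the toy score values below are invented for illustration):

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Covariance numerator and the two standard-deviation terms.
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-segment Positive Climate scores (CLASS uses a 1-7 scale):
predicted = [4.2, 5.1, 3.8, 6.0, 4.9]   # model outputs
expert    = [4.0, 5.5, 3.5, 6.2, 5.0]   # ground-truth CLASS codes

r = pearson(predicted, expert)
```

A correlation near 1 indicates the model ranks and scales segments much like the expert coders; the paper's reported 0.55 (PC) and 0.63 (NC) are computed over n = 300 segments, not a toy list like this.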
Pages: 664-679
Page count: 16
Cited References
71 records in total
[1]   EduSense: Practical classroom sensing at scale [J].
Ahuja, Karan ;
Kim, Dohyun ;
Xhakaj, Franceska ;
Varga, Virag ;
Xie, Anne ;
Zhang, Stanley ;
Townsend, Jay Eric ;
Harrison, Chris ;
Ogan, Amy ;
Agrawal, Yuvraj .
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 2019, 3 (03)
[2]  
Breiman L., 1984, CLASSIFICATION REGRE, DOI 10.1201/9781315139470
[3]   Emotion Sensors Go To School [J].
Arroyo, Ivon ;
Cooper, David G. ;
Burleson, Winslow ;
Woolf, Beverly Park ;
Muldner, Kasia ;
Christopherson, Robert .
ARTIFICIAL INTELLIGENCE IN EDUCATION: BUILDING LEARNING SYSTEMS THAT CARE: FROM KNOWLEDGE REPRESENTATION TO AFFECTIVE MODELLING, 2009, 200 :17-+
[4]   Harnessing Label Uncertainty to Improve Modeling: An Application to Student Engagement Recognition [J].
Aung, Arkar Min ;
Whitehill, Jacob R. .
PROCEEDINGS 2018 13TH IEEE INTERNATIONAL CONFERENCE ON AUTOMATIC FACE & GESTURE RECOGNITION (FG 2018), 2018, :166-170
[5]  
Bai SJ, 2018, arXiv, DOI 10.48550/arXiv.1803.01271
[6]  
Baltrusaitis T, 2016, IEEE WINT CONF APPL
[7]   Attention Augmented Convolutional Networks [J].
Bello, Irwan ;
Zoph, Barret ;
Vaswani, Ashish ;
Shlens, Jonathon ;
Le, Quoc V. .
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, :3285-3294
[8]  
Bill and Melinda Gates Foundation, 2014, MEAS EFF TEACH 2 COR
[9]  
Bosch N., 2015, P 20 INT C INTELLIGE, P379
[10]  
Bosch N., IEEE T AFFECTIVE COMPUT, DOI 10.1109/TAFFC.2019.2908837