Using machine learning to identify the most at-risk students in physics classes

Cited by: 18
Authors
Yang, Jie [1]
DeVore, Seth [1]
Hewagallage, Dona [1]
Miller, Paul [1]
Ryan, Qing X. [2]
Stewart, John [1]
Affiliations
[1] West Virginia Univ, Dept Phys & Astron, Morgantown, WV 26506 USA
[2] Calif State Polytech Univ Pomona, Dept Phys & Astron, Pomona, CA 91768 USA
Funding
National Science Foundation (United States);
Keywords
PERFORMANCE; ANALYTICS; STEM;
DOI
10.1103/PhysRevPhysEducRes.16.020130
Chinese Library Classification (CLC) number
G40 [Education];
Subject classification codes
040101 ; 120403 ;
Abstract
Machine learning algorithms have recently been used to predict students' performance in an introductory physics class. The prediction model classified students as those likely to receive an A or B or students likely to receive a grade of C, D, F or withdraw from the class. Early prediction could better allow the direction of educational interventions and the allocation of educational resources. However, the performance metrics used in that study become unreliable when used to classify whether a student would receive an A, B, or C (the ABC outcome) or if they would receive a D, F or withdraw (W) from the class (the DFW outcome) because the outcome is substantially unbalanced, with 10% to 20% of the students receiving a D, F, or W. This work presents techniques to adjust the prediction models and alternate model performance metrics more appropriate for unbalanced outcome variables. These techniques were applied to three samples drawn from introductory mechanics classes at two institutions (N = 7184, 1683, and 926). Applying the same methods as the earlier study produced a classifier that was very inaccurate, classifying only 16% of the DFW cases correctly; tuning the model increased the DFW classification accuracy to 43%. Using a combination of institutional and in-class data improved DFW accuracy to 53% by the second week of class. As in the prior study, demographic variables such as gender, underrepresented minority status, first-generation college student status, and low socioeconomic status were not important variables in the final prediction models.
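The abstract's point about unbalanced outcomes can be illustrated with a small sketch. The sample below is hypothetical (1000 students, 15% DFW, chosen from the 10% to 20% range the abstract mentions) and is not the study's data. The function names are invented for this illustration, and balanced accuracy and Cohen's kappa are shown only as commonly used metrics for unbalanced outcomes; the abstract does not name the specific metrics the authors adopted.

```python
def confusion_counts(actual, predicted, positive="DFW"):
    """Count true/false positives and negatives, treating DFW as the positive class."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1
        elif a == positive:
            fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def imbalance_metrics(actual, predicted):
    """Overall accuracy plus per-class accuracy, balanced accuracy, and Cohen's kappa."""
    tp, fn, fp, tn = confusion_counts(actual, predicted)
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    # Fraction of actual DFW students classified as DFW.
    dfw_accuracy = tp / (tp + fn) if (tp + fn) else 0.0
    # Fraction of actual ABC students classified as ABC.
    abc_accuracy = tn / (tn + fp) if (tn + fp) else 0.0
    balanced_accuracy = (dfw_accuracy + abc_accuracy) / 2
    # Agreement expected by chance, from the row and column totals.
    p_expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (accuracy - p_expected) / (1 - p_expected) if p_expected != 1 else 0.0
    return {
        "accuracy": accuracy,
        "dfw_accuracy": dfw_accuracy,
        "abc_accuracy": abc_accuracy,
        "balanced_accuracy": balanced_accuracy,
        "kappa": kappa,
    }


# Hypothetical sample: 150 DFW students and 850 ABC students (assumed, not from the paper).
actual = ["DFW"] * 150 + ["ABC"] * 850
# A trivial classifier that predicts ABC for every student.
always_abc = ["ABC"] * 1000

m = imbalance_metrics(actual, always_abc)
for name, value in m.items():
    print(f"{name}: {value:.2f}")
```

In this hypothetical sample the trivial classifier reaches 85% overall accuracy while identifying none of the DFW students, which is why overall accuracy alone says little about how well the at-risk group is detected.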
Pages: 14