Application of logistic regression to predict the failure of students in subjects of a mathematics undergraduate course

Cited by: 0
Authors
Stella F. Costa
Michael M. Diniz
Affiliations
[1] Instituto Federal de Educação, Ciência e Tecnologia de São Paulo - IFSP
Source
Education and Information Technologies | 2022 / Vol. 27
Keywords
Educational data mining; Logistic regression; Student performance; Data analysis
DOI
Not available
Abstract
High student failure rates are a frequent problem in undergraduate courses, and they are even more evident in the exact sciences. Identifying the reasons for this problem is an important research topic, though not an easy task. One alternative is to use Educational Data Mining (EDM) techniques, which convert data from educational databases into useful information in order to understand and improve teaching and learning processes. The objective of this paper is to propose mathematical models based on EDM techniques to estimate the probability that a student in a mathematics degree course at IFSP (Federal Institute of São Paulo) fails exact sciences disciplines, and then to indicate which aspects contribute significantly to student failure rates in these disciplines. We present three logistic regression models that were applied based on socioeconomic data and student performance over 4 years. For interpretation and evaluation of these models, we used odds ratios, the ten-fold cross-validation method, and the metrics accuracy, sensitivity, specificity and area under the ROC curve (AUC). Under cross-validation, the models achieved accuracy over 70%, sensitivity over 70%, specificity over 60% and AUC over 0.75. Analyzing the predictive variables of these models, we identified that factors such as advanced age, rates of failure throughout the course and attendance in the initial semesters can increase the probability of failure in exact science disciplines in the analyzed course.
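The evaluation quantities named in the abstract (odds ratio, ten-fold cross-validation folds, accuracy, sensitivity, specificity, AUC) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names, the 0.5 decision threshold, the choice of "failure" as the positive class, and the toy labels and probabilities are all assumptions made up for illustration.

```python
import math


def odds_ratio(coefficient):
    """Odds ratio for a one-unit increase in a predictor: exp(beta)."""
    return math.exp(coefficient)


def k_fold_indices(n, k=10):
    """Split indices 0..n-1 into k validation folds (no shuffling)."""
    return [list(range(n)[i::k]) for i in range(k)]


def classification_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, sensitivity and specificity; class 1 is assumed to mean 'failed'."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def auc(y_true, y_prob):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true outcomes and predicted failure probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

print(classification_metrics(y_true, y_prob))
print(auc(y_true, y_prob))
print(odds_ratio(0.5))
```

In a full analysis, a logistic regression model would be fitted on nine folds and scored on the held-out fold, with the metrics averaged over the ten rounds; the fitting step is omitted here to keep the sketch dependency-free.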
Pages: 12381-12397
Number of pages: 16