Comparative Analysis of Prediction Techniques to Determine Student Dropout: Logistic Regression vs Decision Trees

Cited by: 0
Authors
Perez, Alfredo [1]
Grandon, Elizabeth E. [1]
Caniupan, Monica [1]
Vargas, Gilda [2]
Affiliations
[1] Univ Bio Bio, Dept Sistemas Informac, Concepcion, Chile
[2] Univ Bio Bio, Dept Estadist, Concepcion, Chile
Source
2018 37TH INTERNATIONAL CONFERENCE OF THE CHILEAN COMPUTER SCIENCE SOCIETY (SCCC) | 2018
Keywords
Student Dropout; Data Mining; SAP; Predictive Analytics; Logistic Regression; Decision Trees; Higher Education
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Detecting students who may drop out of an academic program is a pressing issue for universities, and there are ongoing efforts to examine the variables that determine student dropout. Dropout is defined in different ways; however, studies agree that a student drops out of a course of study only when several variables combine. This study compares the performance indicators of the current dropout model of the Universidad del Bio-Bio (UBB), which is based on logistic regression, with those of a new model based on decision trees. The new model was obtained through data mining methodologies and implemented with the SAP Predictive Analytics tool. Real data from the UBB databases were used to train, validate, and apply the model. The comparison shows that the proposed model predicts student dropout with an accuracy of 86%, a precision of 97%, and an error rate of 14%, better indicators than those currently delivered by the logistic regression model. The resulting prediction model was subsequently optimized by considering other variables, which improved the prediction indicators further. Higher education institutions should take into account the variables that best explain student dropout in order to improve student retention.
Pages: 8
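
The three indicators quoted in the abstract (accuracy, precision, error rate) are standard quantities derived from a binary confusion matrix. The sketch below shows how they relate, with "dropout" treated as the positive class. The function name `classification_metrics` and the four counts are hypothetical: the counts were invented so that the arithmetic lands on the same percentages as the abstract, and they are not the paper's data, whose confusion matrix is not given in this record.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, precision, error_rate) from confusion-matrix counts.

    tp: students predicted to drop out who did drop out
    fp: students predicted to drop out who stayed
    tn: students predicted to stay who stayed
    fn: students predicted to stay who dropped out
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total                         # share of correct predictions
    precision = tp / (tp + fp) if (tp + fp) else 0.0     # share of predicted dropouts that were real
    error_rate = (fp + fn) / total                       # share of wrong predictions, 1 - accuracy
    return accuracy, precision, error_rate


# Hypothetical counts, invented for illustration only (NOT taken from the paper).
tp, fp, tn, fn = 97, 3, 75, 25

accuracy, precision, error_rate = classification_metrics(tp, fp, tn, fn)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} error_rate={error_rate:.2f}")
# prints: accuracy=0.86 precision=0.97 error_rate=0.14
```

Accuracy and error rate always sum to 1, which is why the abstract's 86% and 14% are two views of the same figure. Precision is independent of both, so a model can have high precision while still missing many actual dropouts (the 25 false negatives in this made-up example).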