The Role of Machine Learning in Identifying Students At-Risk and Minimizing Failure

Cited by: 6
Authors
Pek, Reyhan Zeynep [1]
Ozyer, Sibel Tariyan [2]
Elhage, Tarek [3]
Ozyer, Tansel [2]
Alhajj, Reda [1, 4, 5]
Affiliations
[1] Istanbul Medipol Univ, Dept Comp Engn, TR-34810 Istanbul, Turkiye
[2] Ankara Medipol Univ, Dept Comp Engn, TR-06050 Ankara, Turkiye
[3] ABC Private Sch, Abu Dhabi, U Arab Emirates
[4] Univ Calgary, Dept Comp Sci, Calgary, AB T2N 1N4, Canada
[5] Univ Southern Denmark, Dept Health Informat, DK-5230 Odense, Denmark
Keywords
Predictive models; Data models; Machine learning; Stacking; Machine learning algorithms; Prediction algorithms; Data mining; At-risk students; classification; dropout prediction; hybrid model; machine learning techniques; stacking ensemble model; student performance prediction; ACADEMIC-PERFORMANCE; DROPOUT PREDICTION; SCHOOL;
DOI
10.1109/ACCESS.2022.3232984
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Education is very important for students' future success. Instructors can support low-performing students with extra assignments and projects. However, a major problem is that at-risk students cannot be identified early. Various researchers are investigating this problem using machine learning techniques. Machine learning is used in a variety of areas, and it has also begun to be used to identify at-risk students early so that instructors can provide support. This paper discusses the performance results obtained using machine learning algorithms to identify at-risk students and minimize student failure. The main purpose of this project is to create a hybrid model using the stacking ensemble method and to predict at-risk students with this model. We used the following machine learning algorithms: Naive Bayes, Random Forest, Decision Tree, K-Nearest Neighbors, Support Vector Machine, AdaBoost Classifier, and Logistic Regression. The performance of each algorithm was measured with various metrics, and the hybrid model presented in this study combines the algorithms that gave the best prediction results. A data set containing the students' demographic and academic information was used to train and test the model. In addition, the paper presents a web application developed for using the hybrid model effectively and obtaining prediction results. With the proposed method, stratified k-fold cross validation and hyperparameter optimization were found to increase the performance of the models. The hybrid ensemble model was tested with two different combinations of the data to understand the importance of the data features. In the first combination, using both demographic and academic data, the hybrid model achieved an accuracy of 94.8%. In the second combination, using only academic data, the accuracy of the hybrid model increased to 98.4%. This study focuses on predicting the performance of at-risk students early, so that teachers will be able to provide extra assistance to students with low performance.
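The abstract credits stratified k-fold cross validation with improving model performance. As a rough illustration of the idea only, the sketch below builds folds whose class ratios mirror the full label list, which matters when at-risk students are a small minority. The function name `stratified_k_fold`, the seed, and the toy labels are hypothetical and do not come from the paper; the authors' actual implementation is not described in this record.

```python
from collections import defaultdict
import random


def stratified_k_fold(labels, k, seed=0):
    """Yield (train_idx, test_idx) pairs whose class ratios mirror `labels`."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class, then deal indices round-robin across folds
    # so every fold receives a near-equal share of each class.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    # Each fold serves once as the test set; the rest form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


# Hypothetical toy labels: 1 = at-risk, 0 = not at-risk (imbalanced on purpose).
labels = [1] * 4 + [0] * 16
splits = list(stratified_k_fold(labels, k=4))
for train_idx, test_idx in splits:
    at_risk = sum(labels[i] for i in test_idx)
    print(f"test size={len(test_idx)}, at-risk in test={at_risk}")
```

With these toy labels every test fold holds five samples, exactly one of them at-risk, so each fold keeps the 20% at-risk rate of the whole set. When class counts do not divide evenly by `k`, this simple round-robin gives the earlier folds the extra samples; a library implementation would normally be preferred in practice.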
Pages: 1224-1243
Number of pages: 20