A Scalable Machine Learning-based Ensemble Approach to Enhance the Prediction Accuracy for Identifying Students at-Risk

Times cited: 0
Authors
Verma, Swati [1]
Yadav, Rakesh Kumar [1]
Kholiya, Kuldeep [2]
Affiliations
[1] IFTM Univ Moradabad, Moradabad, Uttar Pradesh, India
[2] BT Kumaon Inst Technol, Dwarahat, Uttaranchal, India
Keywords
Educational data mining; resampling methods; feature selection technique; machine learning; imbalanced data; PERFORMANCE; FAILURE; SMOTE
DOI
Not available
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
In educational data mining, early prediction of students' academic performance is among the most important tasks, because it allows timely and appropriate support to be provided to the students who need it. Machine learning techniques can serve as an important tool for predicting low performers in educational institutions. In the present paper, five single supervised machine learning classifiers were used: Decision Tree, Naive Bayes, k-Nearest Neighbor, Support Vector Machine, and Logistic Regression. To analyze the effect of an imbalanced dataset, the performance of these algorithms was checked with and without several resampling methods: Synthetic Minority Oversampling Technique (SMOTE), Borderline SMOTE, SVM-SMOTE, and Adaptive Synthetic (ADASYN). The random hold-out method was used for model validation and GridSearchCV for hyper-parameter tuning. The results indicated that Logistic Regression was the best-performing classifier on every balanced dataset generated by the four resampling techniques, and that it achieved its highest accuracy, 94.54%, with SMOTE. Furthermore, to improve the prediction results and make the model scalable, the most suitable classifier was combined with bagging, and an accuracy of 95.45% was achieved.
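The abstract names SMOTE as the resampling method that gave the best result. As a rough illustration of the idea only, and not the authors' implementation, the following standard-library Python sketch creates synthetic minority samples by interpolating between a minority sample and one of its k nearest minority neighbours. The function names `smote` and `balance`, the toy data, and the binary-label assumption are all hypothetical choices made for this example.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority
    samples and their k nearest minority neighbours (basic SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Nearest neighbours of `base` among the remaining minority samples.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label, seed=0):
    """Oversample the minority class until both classes have equal size.
    Assumption: binary labels, so every other sample is majority."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    new_points = smote(minority, n_majority - len(minority), seed=seed)
    return X + new_points, y + [minority_label] * len(new_points)


# Toy data (made up): six "not at risk" samples (0), two "at risk" samples (1).
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
     [0.2, 0.4], [0.4, 0.2], [2.0, 2.0], [2.2, 2.1]]
y = [0] * 6 + [1] * 2

X_bal, y_bal = balance(X, y, minority_label=1)
```

After balancing, both classes have six samples, and every synthetic point lies on the segment between the two original minority samples. In practice, resampling of this kind is normally applied to the training split only, so that the held-out data keeps its original class distribution.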
Pages: 185-192
Number of pages: 8