A Scalable Machine Learning-based Ensemble Approach to Enhance the Prediction Accuracy for Identifying Students at-Risk

Authors
Verma, Swati [1]
Yadav, Rakesh Kumar [1]
Kholiya, Kuldeep [2]
Affiliations
[1] IFTM Univ Moradabad, Moradabad, Uttar Pradesh, India
[2] BT Kumaon Inst Technol, Dwaraht, Uttaranchal, India
Keywords
Educational data mining; resampling methods; feature selection technique; machine learning; imbalanced data; PERFORMANCE; FAILURE; SMOTE
DOI
Not available
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
In educational data mining, early prediction of students' academic performance is among the most important tasks, because it allows timely and appropriate support to be provided to the students who need it. Machine learning techniques can serve as an important tool for predicting low performers in educational institutions. In the present paper, five individual supervised machine learning techniques are used: Decision Tree, Naive Bayes, k-Nearest Neighbor, Support Vector Machine, and Logistic Regression. To analyze the effect of an imbalanced dataset, the performance of these algorithms is evaluated with and without four resampling methods: Synthetic Minority Oversampling Technique (SMOTE), Borderline SMOTE, SVM-SMOTE, and Adaptive Synthetic (ADASYN). The random hold-out method was used for model validation and GridSearchCV for hyper-parameter tuning. The results indicate that Logistic Regression is the best-performing classifier on every balanced dataset generated by the four resampling techniques, and that it achieves its highest accuracy, 94.54%, with SMOTE. Furthermore, to improve the prediction results and make the model scalable, this best-performing classifier was combined with bagging, reaching an accuracy of 95.45%.
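The workflow the abstract describes (oversample the minority class, hold out a test split, tune Logistic Regression with GridSearchCV, then bag the tuned classifier) might look roughly like the sketch below. It is not the authors' code. The synthetic dataset, the 70/30 split, the `C` grid, the number of bagged estimators, and the choice to resample only the training split are all assumptions made for illustration. The helper `smote_like_oversample` is a hypothetical, simplified stand-in for SMOTE, written with NumPy for binary labels, and is not a library function.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split


def smote_like_oversample(X, y, minority_label, k=5, seed=0):
    """Hypothetical helper: SMOTE-style interpolation for binary labels."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority_label]
    # Number of synthetic rows needed to match the majority class size.
    n_new = int((y != minority_label).sum() - len(X_min))
    if n_new <= 0 or len(X_min) < 2:
        return X, y
    k = min(k, len(X_min) - 1)
    # Pairwise distances between minority samples; exclude self-matches.
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    neighbours = np.argsort(dist, axis=1)[:, :k]
    # Pick a random base sample and one of its k nearest minority neighbours.
    base = rng.integers(0, len(X_min), size=n_new)
    chosen = neighbours[base, rng.integers(0, k, size=n_new)]
    # Place each synthetic point on the segment between base and neighbour.
    gap = rng.random((n_new, 1))
    X_new = X_min[base] + gap * (X_min[chosen] - X_min[base])
    y_new = np.full(n_new, minority_label, dtype=y.dtype)
    return np.vstack([X, X_new]), np.concatenate([y, y_new])


# Assumed stand-in data: roughly 10% of samples in the "at-risk" class (label 1).
X, y = make_classification(n_samples=600, n_features=10,
                           weights=[0.9, 0.1], random_state=0)

# Random hold-out validation (the 70/30 ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Balance the training split only, so the test split keeps its original skew.
X_bal, y_bal = smote_like_oversample(X_train, y_train, minority_label=1)

# Hyper-parameter tuning with GridSearchCV (the grid is an assumption).
search = GridSearchCV(LogisticRegression(max_iter=1000),
                      {"C": [0.01, 0.1, 1.0, 10.0]}, cv=5)
search.fit(X_bal, y_bal)

# Bag the tuned classifier; the estimator is passed positionally.
bagged = BaggingClassifier(
    LogisticRegression(max_iter=1000, C=search.best_params_["C"]),
    n_estimators=10, random_state=0)
bagged.fit(X_bal, y_bal)

single_acc = search.score(X_test, y_test)
bagged_acc = bagged.score(X_test, y_test)
print(f"tuned LR accuracy:  {single_acc:.4f}")
print(f"bagged LR accuracy: {bagged_acc:.4f}")
```

The accuracies printed here come from synthetic data and say nothing about the 94.54% and 95.45% figures reported in the abstract. In practice the resampling step would more likely use a maintained implementation such as the one in the imbalanced-learn package, which also provides the Borderline SMOTE, SVM-SMOTE, and ADASYN variants named above.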
Pages: 185-192
Page count: 8