Empowering education: Harnessing ensemble machine learning approach and ACO-DT classifier for early student academic performance prediction

Cited by: 0
Authors
Mahawar, Kajal [1]
Rattan, Punam [1]
Affiliations
[1] Lovely Profess Univ, Sch Comp Applicat, Phagwara, Punjab, India
Keywords
Machine learning; Students' performance; Multivariate ensemble model; ML classifiers; Feature selection; Ant Colony Optimization;
DOI
10.1007/s10639-024-12976-6
Chinese Library Classification (CLC) number
G40 [Education];
Subject classification codes
040101 ; 120403 ;
Abstract
Higher education institutions have consistently strived to provide students with high-quality education. Machine learning (ML) algorithms can support this goal by simplifying the prediction of student outcomes: academicians can use ML to gain insight into student data and mine it to forecast performance. In this paper, the authors propose an ML-based student performance prediction model that considers demographic, social, psychological, and economic factors collectively. The dataset was compiled from a purpose-designed questionnaire administered to second-year undergraduate students. The objective of the study is to uncover factors that could assist in predicting students' performance. Eight ML classifiers (logistic regression, random forest, support vector machine, XGBoost, support vector machine with a linear kernel, naïve Bayes, K-Nearest Neighbor, and decision tree) are used to forecast student performance. Additionally, nine feature selection techniques (variance threshold, XGBoost, feature importance, recursive feature elimination, chi-square, ridge, Pearson correlation, lasso, and random forest) are employed to determine the optimal factors. The authors experimented with each technique on two train/test splits, 80:20 and 70:30. The ensemble DXK (DT + XGB + KNN) model with cross-validation and the 80:20 split outperformed the other standard classifiers, achieving the highest scores among them: an accuracy of 97.83%, an r-square of 96.17%, a precision of 97.94%, a recall of 97.83%, and an f1-score of 97.88%. Additionally, the authors propose the ACO-DT model, which improves the prediction performance of the top-performing DT classifier by applying the Ant Colony Optimization technique. The findings demonstrate that the proposed model with the 80:20 split achieves an accuracy of 98.15%, an f1-score of 98.16%, a precision of 98.18%, a recall of 98.15%, and an r-square of 84.75%, surpassing all other models for forecasting student performance. For the specified data size, the model creation time is 8.49 s. The authors also recommend future research directions to further enhance this study.
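In both sets of results above, the reported recall equals the reported accuracy (97.83% and 98.15%). That is what one would expect if precision, recall, and f1-score were computed as support-weighted averages over the classes, because weighted-average recall is algebraically identical to accuracy. The abstract does not state the averaging scheme, so this is an assumption. The following minimal Python sketch illustrates the identity; the function name `weighted_metrics` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1.

    Assumption: metrics are averaged over classes, weighted by each
    class's share of the true labels (its "support").
    """
    n = len(y_true)
    support = Counter(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n

    precision = recall = f1 = 0.0
    for label, count in support.items():
        # True positives and total predictions for this class
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)

        p_c = tp / predicted if predicted else 0.0
        r_c = tp / count
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0

        weight = count / n
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels invented purely for illustration (not the study's data)
y_true = ["pass", "pass", "pass", "fail", "fail", "pass"]
y_pred = ["pass", "pass", "fail", "fail", "pass", "pass"]
scores = weighted_metrics(y_true, y_pred)
print(scores)
```

Weighted recall sums (class share) x (true positives / class size) over the classes, which reduces to total correct predictions divided by total samples, so it matches accuracy for any label set. Weighted precision and f1-score carry no such guarantee and can differ from accuracy, as they do in the figures reported above.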
Pages: 4639-4667
Page count: 29