Interpretable Machine Learning Models for PISA Results in Mathematics

Times cited: 0
Authors
Gomez-Talal, Ismael [1]
Bote-Curiel, Luis [1]
Rojo-Alvarez, Jose Luis [1, 2]
Affiliations
[1] Rey Juan Carlos Univ, Dept Signal Theory & Commun & Telematic Syst & Com, Madrid 28943, Spain
[2] DiemmaLab Ltd, Madrid 28943, Spain
Source
IEEE ACCESS | 2025 / Vol. 13
Keywords
Mathematical models; Biological system modeling; Education; Predictive models; Analytical models; Socioeconomics; Machine learning; Support vector machines; Data models; Navigation; Programme for international student assessment; education; interpretable machine learning; Shapley additive explanations; meta-modeling approach; dashboard visualization; CROSS-VALIDATION; PERFORMANCE; COUNTRIES;
DOI
10.1109/ACCESS.2025.3538585
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The Programme for International Student Assessment (PISA) 2022 provides a global framework for assessing educational performance. We examined the characteristics of, and disparities in, Mathematics performance among Spanish adolescents and identified the factors that influence these results. To this end, we proposed using advanced Machine Learning techniques, in the form of possibly non-linear predictive models, to identify key drivers of Mathematics performance and so inform data-driven educational policies and interventions that improve learning outcomes. After preprocessing the PISA dataset, we categorized students into Low, Medium, and High proficiency levels and employed various binary classification models to discern predictive patterns. In addition, we developed a stacking meta-model that integrates the strengths of eight distinct predictive models to enhance prediction accuracy. Our results showed that the meta-model outperforms the individual models in predicting student performance across proficiency levels, with consistently higher Precision, Recall, and Area Under the Curve (AUC) scores. Specifically, the meta-model achieved an AUC of 0.9766 when classifying students in the Low and High proficiency categories. We adopted the Shapley Additive exPlanations method to explain the model decisions, highlighting significant predictors such as grade repetition, access to digital devices, and extracurricular Mathematics classes. We also introduced an interactive dashboard that uses Uniform Manifold Approximation and Projection for dimensionality reduction and enables a granular view of the educational landscape. The overall aim is to contribute to an education system that is well informed and adapts effectively to the diverse needs of students.
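As a rough illustration of the pipeline the abstract describes (binning students into Low, Medium, and High levels, training a binary Low-versus-High classifier, and combining base models in a stacking meta-model scored by AUC), the following sketch uses scikit-learn on synthetic data. It is not the authors' implementation: the data are randomly generated rather than taken from PISA, the tercile cut-offs are an assumption (the abstract does not state the thresholds), and only four base learners are used instead of the eight mentioned, chosen here arbitrarily.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for student features and a continuous score.
# ASSUMPTION: this is generated data, not the PISA 2022 dataset.
X, score = make_regression(
    n_samples=900, n_features=10, n_informative=5, noise=20.0, random_state=0
)

# ASSUMPTION: Low/Medium/High are defined by score terciles for illustration.
low_cut, high_cut = np.quantile(score, [1 / 3, 2 / 3])
level = np.digitize(score, [low_cut, high_cut])  # 0 = Low, 1 = Medium, 2 = High

# Binary task: keep only Low and High students, with High as the positive class.
mask = level != 1
X_bin = X[mask]
y_bin = (level[mask] == 2).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X_bin, y_bin, test_size=0.25, stratify=y_bin, random_state=0
)

# ASSUMPTION: illustrative base learners; the paper combines eight models.
base_models = [
    ("logreg", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ("svm", make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))),
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=15))),
    ("forest", RandomForestClassifier(n_estimators=100, random_state=0)),
]

# The meta-model is trained on cross-validated predictions of the base models.
meta_model = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
meta_model.fit(X_train, y_train)

# AUC on held-out data, using the predicted probability of the High class.
auc = roc_auc_score(y_test, meta_model.predict_proba(X_test)[:, 1])
print(f"Stacking meta-model AUC (Low vs High, synthetic data): {auc:.4f}")
```

The AUC printed here reflects only the synthetic data and has no bearing on the 0.9766 reported in the abstract. The same pattern could be repeated for the other pairs of proficiency levels by changing which level is masked out.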
Pages: 27371 - 27397
Number of pages: 27