Predicting science achievement scores with machine learning algorithms: a case study of OECD PISA 2015-2018 data

Times cited: 9
Authors
Acisli-Celik, Sibel [1]
Yesilkanat, Cafer Mert [1]
Affiliations
[1] Artvin Coruh Univ, Sci Teaching Dept, Artvin, Turkiye
Keywords
Artificial intelligence; Interdisciplinary; transdisciplinary; Random forest; XGBoost; SUPPORT VECTOR REGRESSION; RANDOM SUBSPACE METHOD; EDUCATION POLICY; RANDOM FORESTS; STUDENTS; PERFORMANCE; ASSESSMENTS; VARIABLES; LITERACY; MODELS
DOI
10.1007/s00521-023-08901-6
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This study examined how well machine learning methods trained on PISA 2015 data predict the science achievement scores of students who took the next assessment cycle, PISA 2018, as well as the countries' average science scores. The sample consists of 67,329 students who took the PISA 2015 exam in 13 randomly selected countries (Brazil, Chinese Taipei, Dominican Republic, Estonia, Finland, Hungary, Italy, Japan, Lithuania, Luxembourg, Peru, Singapore, Turkiye). Four algorithms were used: multiple linear regression, support vector regression, random forest, and extreme gradient boosting (XGBoost). For each country, a randomly selected part of the PISA 2015 data was used as training data and the remaining part as testing data to evaluate model performance. XGBoost showed the best performance in estimating both the PISA 2015 test data and the PISA 2018 science achievement scores in every country studied. With this algorithm, student-level PISA 2018 science scores were estimated most accurately for Luxembourg (r = 0.600, RMSE = 75.06, MAE = 59.97) and least accurately for Finland (r = 0.467, RMSE = 79.38, MAE = 63.24). In addition, the countries' average PISA 2018 science scores were estimated with the XGBoost algorithm, with very high accuracy for all the countries studied.
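The evaluation described in the abstract (a random training/testing split, then scoring predictions with r, RMSE, and MAE) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' pipeline: the 30% test fraction, the seed, the helper names, and the toy score values are all assumptions, since the abstract does not state the split ratio or any implementation details.

```python
import math
import random


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted scores."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Randomly split rows into (train, test).

    The 0.3 test fraction is an assumed value for illustration only.
    """
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical toy values, not PISA data.
observed = [400.0, 450.0, 500.0, 550.0, 600.0]
predicted = [420.0, 440.0, 510.0, 530.0, 610.0]

print("r    =", round(pearson_r(observed, predicted), 3))
print("RMSE =", round(rmse(observed, predicted), 2))
print("MAE  =", round(mae(observed, predicted), 2))

train_rows, test_rows = train_test_split(list(range(10)))
print(len(train_rows), "training rows,", len(test_rows), "testing rows")
```

A higher r together with lower RMSE and MAE indicates more accurate predictions, which is how the per-country figures in the abstract are read.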
Pages: 21201-21228
Page count: 28