Predicting student academic performance using multi-model heterogeneous ensemble approach

Cited by: 58
Authors
Adejo, Olugbenga Wilson [1]
Connolly, Thomas [2]
Affiliations
[1] Univ West Scotland, Sch Engn & Comp, Paisley, Renfrew, Scotland
[2] Univ West Scotland, Paisley, Renfrew, Scotland
Keywords
SVM; ANN; Higher education; Performance prediction; DT; Ensemble model; Learners' performance; Education data mining;
DOI
10.1108/JARHE-09-2017-0113
Chinese Library Classification
G40 [Education];
Discipline classification code
040101; 120403;
Abstract
Purpose - This paper empirically investigates and compares the use of multiple data sources, different classifiers and ensembles of classifiers in predicting student academic performance. The study compares the performance and efficiency of ensemble techniques that use different combinations of data sources with those of base classifiers that use a single data source. Design/methodology/approach - Using a quantitative research methodology, data on 141 learners enrolled at the University of the West of Scotland were extracted from the institution's databases and collected through a survey questionnaire. The research focused on three data sources (student record system, learning management system and survey) and used three state-of-the-art data mining classifiers for the modeling: decision tree, artificial neural network and support vector machine. In addition, ensembles of these base classifiers were used to predict student performance, and the performance of the seven resulting models was compared using six evaluation metrics. Findings - The results show that using multiple data sources together with heterogeneous ensemble techniques is efficient and accurate in predicting student performance and helps to identify students at risk of attrition. Practical implications - The proposed approach will help educational administrators and policy makers in the education sector to develop new higher education policies and curricula that are relevant to student retention. More generally, by combining data sources, the approach supports accurate early identification of students at risk of dropping out of higher education (HE), so that the necessary support and intervention can be provided.
Originality/value - The research empirically investigated and compared the accuracy and efficiency of single classifiers and ensembles of classifiers that use single and multiple data sources. The study developed a novel hybrid model for predicting student performance that is both accurate and efficient. More broadly, the research advances understanding of how ensemble techniques can be applied to predicting student performance from learner data, and it addresses two fundamental questions: What combination of variables will accurately predict student academic performance? What is the potential of stacking ensemble techniques for accurately predicting student academic performance?
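The heterogeneous stacking ensemble named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn, a synthetic 141-row data set standing in for the learner data (which is not reproduced in this record), a logistic-regression meta-learner, and arbitrary hyperparameters. Only the three base classifier families (decision tree, artificial neural network, support vector machine) come from the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in for the 141-learner data set;
# the real features and labels are not given in this record.
X, y = make_classification(
    n_samples=141, n_features=12, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Three heterogeneous base learners: DT, ANN and SVM.
# Hyperparameters are illustrative assumptions, not values from the paper.
base_learners = [
    ("dt", DecisionTreeClassifier(max_depth=4, random_state=0)),
    ("ann", make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    )),
    ("svm", make_pipeline(StandardScaler(), SVC(kernel="rbf", random_state=0))),
]

# Stacking: cross-validated outputs of the base learners become the
# inputs of a meta-learner (assumed here to be logistic regression).
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(),
    cv=5,
)
stack.fit(X_train, y_train)

predictions = stack.predict(X_test)
accuracy = stack.score(X_test, y_test)
print(f"stacked accuracy on held-out data: {accuracy:.3f}")
```

Fitting each entry of `base_learners` on its own and scoring it on the same held-out split would give the kind of single-classifier baseline that the abstract compares the ensemble against.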
Pages: 61-75
Page count: 15