A Learning Analytics Approach to Identify Students at Risk of Dropout: A Case Study with a Technical Distance Education Course

Cited by: 40
Authors
Queiroga, Emanuel Marques [1, 2]
Lopes, Joao Ladislau [2]
Kappel, Kristofer [1]
Aguiar, Marilton [1]
Araujo, Ricardo Matsumura [1]
Munoz, Roberto [3]
Villarroel, Rodolfo [4]
Cechinel, Cristian [5]
Affiliations
[1] Univ Fed Pelotas UFPel, Ctr Desenvolvimento Tecnol CDTEC, BR-96010610 Pelotas, RS, Brazil
[2] Inst Fed Educ Ciencia & Tecnol Rio Grandense IFSu, BR-96015560 Pelotas, RS, Brazil
[3] Univ Valparaiso, Escuela Ingn Informat, Valparaiso 2362735, Chile
[4] Pontificia Univ Catolica Valparaiso, Escuela Ingn Informat, Valparaiso 2362807, Chile
[5] Univ Fed Santa Catarina UFSC, Ctr Ciencias Tecnol Saude CTS, BR-88906072 Ararangua, Brazil
Source
APPLIED SCIENCES-BASEL | 2020 / Vol. 10 / Issue 11
Keywords
at-risk students; genetic algorithm; learning analytics; educational data mining; PREDICTION
DOI
10.3390/app10113998
CLC classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Contemporary education is a vast field that is concerned with the performance of education systems. In a formal e-learning context, student dropout is considered one of the main problems and has received much attention from the learning analytics research community, which has reported several approaches to developing models for the early prediction of at-risk students. However, maximizing the performance of these predictions remains a considerable challenge. In this work, we developed a solution that uses only students' interactions with the virtual learning environment, and features derived from them, for the early prediction of at-risk students in a Brazilian distance technical high school course that is 103 weeks in duration. To maximize results, we developed an elitist genetic algorithm, based on Darwin's theory of natural selection, for hyperparameter tuning. With the proposed technique, we predicted at-risk students with an Area Under the Receiver Operating Characteristic Curve (AUROC) above 0.75 in the initial weeks of the course. The results demonstrate the viability of using interaction counts and derived features to generate prediction models in contexts where access to demographic data is restricted. Applying a genetic algorithm to the tuning of classifier hyperparameters can increase classifier performance in comparison with other techniques.
Pages: 20
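
The abstract mentions an elitist genetic algorithm for hyperparameter tuning but gives no implementation details. The following is a minimal sketch of that general technique, not the authors' method. The search space (`max_depth`, `n_estimators`), the operators (tournament selection, uniform crossover, resampling mutation), and `toy_fitness` are all assumptions made for illustration. In practice the fitness would be something like a cross-validated AUROC of a classifier trained with the candidate hyperparameters.

```python
import random

# Hypothetical search space: names and ranges are illustrative, not from the paper.
SEARCH_SPACE = {
    "max_depth": (1, 20),
    "n_estimators": (10, 200),
}


def toy_fitness(individual):
    """Stand-in for a cross-validated AUROC; peaks at max_depth=8, n_estimators=120."""
    depth_penalty = abs(individual["max_depth"] - 8) / 20
    trees_penalty = abs(individual["n_estimators"] - 120) / 200
    return 1.0 - 0.5 * (depth_penalty + trees_penalty)


def random_individual(rng):
    """Draw every hyperparameter uniformly from its integer range."""
    return {name: rng.randint(lo, hi) for name, (lo, hi) in SEARCH_SPACE.items()}


def crossover(parent_a, parent_b, rng):
    """Uniform crossover: each gene is copied from one parent at random."""
    return {name: rng.choice((parent_a[name], parent_b[name])) for name in SEARCH_SPACE}


def mutate(individual, rng, rate=0.2):
    """Resample each gene within its range with probability `rate`."""
    child = dict(individual)
    for name, (lo, hi) in SEARCH_SPACE.items():
        if rng.random() < rate:
            child[name] = rng.randint(lo, hi)
    return child


def tournament(population, fitness, rng, k=3):
    """Return the fittest of k randomly drawn individuals."""
    return max(rng.sample(population, k), key=fitness)


def elitist_ga(fitness, pop_size=20, generations=30, n_elite=2, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        best_per_generation.append(fitness(ranked[0]))
        # Elitism: the top n_elite individuals survive unchanged.
        next_population = ranked[:n_elite]
        while len(next_population) < pop_size:
            child = crossover(tournament(population, fitness, rng),
                              tournament(population, fitness, rng), rng)
            next_population.append(mutate(child, rng))
        population = next_population
    best = max(population, key=fitness)
    best_per_generation.append(fitness(best))
    return best, best_per_generation


best, history = elitist_ga(toy_fitness)
print(best, round(toy_fitness(best), 3))
```

Because the elite individuals are copied into the next generation unchanged, the best fitness in `history` can never decrease from one generation to the next, which is the defining property of an elitist scheme.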