Dropout prediction model in MOOC based on clickstream data and student sample weight

Cited by: 0
Authors
Cong Jin
Affiliation
[1] School of Computer, Central China Normal University
Source
Soft Computing | 2021 / Vol. 25
Keywords
MOOC; Dropout prediction; Initial weight calculation; Intelligent optimization; Clickstream data
DOI
Not available
Abstract
The high dropout rate of massive open online courses (MOOCs) has seriously limited their popularity and adoption, so predicting student dropout early enough to intervene has become an active research topic. Students in a MOOC differ widely in learning behavior, learning habits, and study time. Because the performance of a machine learning classifier depends heavily on the quality of its training samples, different student samples contribute differently to the predictive performance of a machine learning-based dropout prediction model (DPM). To address this problem, this paper proposes a new machine learning-based DPM. Since the traditional concept of a neighborhood ignores sample labels, a new neighborhood definition, the max neighborhood, is introduced first; it depends on both the distance between samples and the labels of the samples. An algorithm for computing the initial weight of each student sample is then developed from the max neighborhood definition, in contrast to the common practice of selecting initial values at random. Next, the initial sample weights are further optimized with an intelligent optimization method. Finally, classifiers trained on the weighted training samples serve as the DPM. Experimental results on public data sets, assessed by direct observation and statistical testing, indicate that training sample weighting and intelligent optimization can significantly improve the predictive performance of the DPM.
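The abstract does not give the formal definition of the max neighborhood, so the following is only a minimal sketch of the general idea of a label-aware initial weight, not the paper's algorithm. It assumes, purely for illustration, that a sample's neighborhood is the largest ball around it that contains no sample with a different label, and that the initial weight is the share of same-label samples falling inside that ball. The function name `label_aware_weights` and the toy data are hypothetical.

```python
import math


def label_aware_weights(X, y):
    """Return one initial weight per sample from a label-aware neighborhood.

    Assumption (not taken from the paper): the neighborhood of sample i is
    the largest ball centred on i that contains no differently labeled
    sample, and the weight is the share of same-label samples inside it.
    """
    n = len(X)
    weights = []
    for i in range(n):
        # Radius: distance to the nearest sample carrying a different label.
        radius = min(
            (math.dist(X[i], X[j]) for j in range(n) if y[j] != y[i]),
            default=math.inf,
        )
        # Other samples sharing the label of sample i.
        same = [j for j in range(n) if j != i and y[j] == y[i]]
        # Same-label samples strictly inside the ball.
        inside = [j for j in same if math.dist(X[i], X[j]) < radius]
        # Add 1 to both counts so the sample itself is included
        # and the weight is always in (0, 1].
        weights.append((len(inside) + 1) / (len(same) + 1))
    return weights


# Toy one-dimensional data: the third sample has label 0 but sits next to
# the label-1 cluster, so it should receive a low weight.
X = [(0.0,), (1.0,), (9.0,), (10.0,), (11.0,)]
y = [0, 0, 0, 1, 1]
print(label_aware_weights(X, y))
```

Weights of this kind could then be passed to any classifier that accepts per-sample weights during training, and refined further by an optimizer, which is the role the abstract assigns to the intelligent optimization step.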
Pages: 8971-8988
Page count: 17