Research on learning behavior patterns from the perspective of educational data mining: Evaluation, prediction and visualization

Cited by: 18
Authors
Feng, Guiyun [1]
Fan, Muwei [1]
Affiliations
[1] Guizhou Univ, Sch Management, Guiyang 550025, Peoples R China
Keywords
Educational data mining; Learning behavior patterns; Evaluation methodologies; Classification algorithms; Student performance; Model; LogitBoost
DOI
10.1016/j.eswa.2023.121555
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The rapid growth of educational data creates a need to mine useful information from learning behavior patterns, and advances in data mining technology make educational data mining possible. This paper uses a public educational data set to study learning behavior patterns from the perspective of educational data mining, with the aim of promoting innovation in educational management. First, to reduce the dimensionality of the analysis and improve efficiency, principal component analysis is carried out to reduce the number of attributes in the data set. The significant attributes in the rotated principal component matrix, rather than the principal components themselves, which are not closely related to learning behavior patterns, are extracted as the research variables. Then, a pseudo statistic is proposed to determine the number of clusters, and the preprocessed data set is clustered according to the extracted attributes. The clustering results are used to add class labels to the data, which supports the subsequent model training. Finally, six classification algorithms (J48, K-Nearest Neighbor, Bayes Net, Random Forest, Support Vector Machine, and LogitBoost) are trained on the labeled data to build prediction models. The performance and applicable conditions of the six classifiers are discussed and compared in terms of accuracy, efficiency, error, and other measures. The ensemble algorithms are found to perform better than the single classifiers. Among the ensemble algorithms, LogitBoost has a shorter running time than Random Forest.
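The "cluster, then label, then classify" step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm, so plain k-means is assumed here, the number of clusters is fixed at 2 instead of being chosen by the proposed pseudo statistic, the principal component analysis step is omitted, and a small k-nearest-neighbor classifier stands in for the six classifiers. The data are synthetic, and every function and variable name is hypothetical.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain k-means (assumed here); returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as starting centroids
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Recompute each centroid as the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, labels


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points nearest to the query."""
    nearest = sorted(zip(train_x, train_y), key=lambda xy: math.dist(xy[0], query))[:k]
    votes = [y for _, y in nearest]
    return max(set(votes), key=votes.count)


# Synthetic stand-in for two extracted attributes: two well-separated groups.
low = [(0.1 * i, 0.1 * j) for i in range(3) for j in range(3)]
high = [(5 + 0.1 * i, 5 + 0.1 * j) for i in range(3) for j in range(3)]
points = low + high

# Step 1: cluster the unlabeled records (k = 2 is an assumption).
centroids, labels = kmeans(points, k=2)

# Step 2: treat the cluster ids as class labels and classify new records.
new_low = knn_predict(points, labels, (0.05, 0.05))
new_high = knn_predict(points, labels, (5.05, 5.05))
print(new_low != new_high)
```

The cluster ids carry no meaning beyond group membership, so the sketch compares predictions with the labels of known group members rather than with fixed numbers.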
Pages: 11