Imbalanced educational data classification: an effective approach with resampling and random forest

Citations: 0
Authors
Vo Thi Ngoc Chau [1]
Nguyen Hua Phung [1]
Affiliations
[1] Ho Chi Minh City Univ Technol, Fac Comp Sci & Engn, Ho Chi Minh City, Vietnam
Source
PROCEEDINGS OF 2013 IEEE RIVF INTERNATIONAL CONFERENCE ON COMPUTING AND COMMUNICATION TECHNOLOGIES: RESEARCH, INNOVATION, AND VISION FOR THE FUTURE (RIVF) | 2013
Keywords
imbalanced data classification; resampling; random forest; educational data mining; academic credit system
DOI
Not available
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Educational data mining is an emerging area of data mining research. Although it applies established data mining techniques and methods, it poses many challenges that have not been completely resolved. Data classification in an academic credit system is especially difficult: on the technical side it must deal with class imbalance and missing data, and on the practical side it must cope with the flexibility of the education system, which makes the data heterogeneous. In this paper, we present an approach that combines a hybrid resampling scheme with random forest for the imbalanced, multi-class educational data classification task based on students' performance. The proposed approach has not previously been available in educational data mining. Our extensive empirical study shows that it is effective for predicting students' final study status and usable in a knowledge-driven educational decision support system.
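The abstract does not spell out the hybrid resampling scheme, so the following is only a minimal sketch of one common reading: random undersampling of over-represented classes combined with random oversampling (with replacement) of under-represented ones. The function name `hybrid_resample`, the choice of the mean class size as the balancing target, and the class labels in the toy data are all assumptions made for illustration, not details taken from the paper.

```python
import random
from collections import Counter, defaultdict


def hybrid_resample(rows, labels, target=None, seed=0):
    """Balance classes by undersampling large ones and oversampling small ones.

    Hypothetical helper: the balancing target and sampling strategy are
    assumptions, not the scheme used by the paper's authors.
    """
    rng = random.Random(seed)

    # Group the records by class label.
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    if target is None:
        # Assumption: bring every class to the mean class size.
        target = round(len(rows) / len(by_class))

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if len(members) > target:
            # Undersample: draw `target` records without replacement.
            chosen = rng.sample(members, target)
        else:
            # Oversample: keep all records, then add random duplicates.
            extra = rng.choices(members, k=target - len(members))
            chosen = members + extra
        out_rows.extend(chosen)
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Toy, made-up data: 12 students, three imbalanced study-status classes.
rows = [[i] for i in range(12)]
labels = ["graduating"] * 9 + ["study_stop"] * 2 + ["first_warned"] * 1

balanced_rows, balanced_labels = hybrid_resample(rows, labels)
print(Counter(balanced_labels))
```

In a pipeline of the kind the abstract describes, the balanced training set would then be passed to a random forest learner; resampling would normally be applied to the training split only, so that the evaluation data keeps its original class distribution.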
Pages: 135-140
Page count: 6