Evaluating the effectiveness of educational data mining techniques for early prediction of students' academic failure in introductory programming courses

Cited by: 212
Authors
Costa, Evandro B. [1]
Fonseca, Baldoino [1]
Santana, Marcelo Almeida [1]
de Araujo, Fabrisia Ferreira [2, 3]
Rego, Joilson [4]
Affiliations
[1] Fed Univ Alagoas UFAL, Maceio, Brazil
[2] Fed Inst Alagoas IFAL, Palmeira Dos Indios, AL, Brazil
[3] Univ Fed Campina Grande, Campina Grande, Brazil
[4] Fed Univ Rio Grande Norte UFRN, Natal, RN, Brazil
Keywords
Artificial intelligence in education; Automatic instructional planner; Automatic prediction; Educational data mining; Interactive learning environment; Learner modeling; CLASSIFIER; ALGORITHM; DROPOUT
DOI
10.1016/j.chb.2017.01.047
Chinese Library Classification number
B84 [Psychology]
Discipline classification code
04; 0402
Abstract
Data on high student failure rates in introductory programming courses have alarmed many educators and raised a number of important questions about prediction. In this paper, we present a comparative study of the effectiveness of educational data mining techniques for the early prediction of students likely to fail introductory programming courses. Although several works have analyzed these techniques to identify students' academic failures, our study differs from existing ones as follows: (i) we investigate the effectiveness of such techniques in identifying students likely to fail at a stage early enough for action to be taken to reduce the failure rate; (ii) we analyze the impact of data preprocessing and algorithm fine-tuning tasks on the effectiveness of these techniques. We evaluated the effectiveness of four prediction techniques on two different and independent data sources from introductory programming courses at a Brazilian public university: one from distance education and the other from on-campus classes. The results showed that the techniques analyzed are able to identify, at an early stage, students likely to fail; that the effectiveness of some of these techniques improves after data preprocessing and/or algorithm fine-tuning; and that the support vector machine technique outperforms the others in a statistically significant way. © 2017 Elsevier Ltd. All rights reserved.
Pages: 247-256
Number of pages: 10