Cost-sensitive boosting in software quality modeling

Cited by: 31
Author
Khoshgoftaar, TM [1]
Affiliation
[1] Florida Atlantic Univ, Empir Software Engn Lab, Dept Comp Sci & Engn, Boca Raton, FL 33431 USA
Source
7TH IEEE INTERNATIONAL SYMPOSIUM ON HIGH ASSURANCE SYSTEMS ENGINEERING, PROCEEDINGS | 2002
Keywords
software quality modeling; C4.5; decision stumps; cost-sensitivity; boosting; cost-boosting;
DOI
10.1109/HASE.2002.1173102
CLC Number
TP [Automation technology; computer technology];
Subject Classification Code
0812;
Abstract
Early prediction of software module quality, prior to testing and operations, can yield great benefits to software development teams, especially those building high-assurance and mission-critical systems. Such predictions allow testing resources to be directed at the modules that need them most, improving the reliability of the system. Several predictive tools are available for this purpose. Software classification models predict the class of a module, i.e., fault-prone or not fault-prone. Recent advances in data mining make it possible to improve on individual classifiers (models) by combining the decisions of multiple classifiers. This paper presents two algorithms based on this concept of combined classification, both of which produced useful models for software quality modeling. A comprehensive comparative evaluation of the Boosting and Cost-Boosting algorithms is presented, and we demonstrate how boosting (original and cost-sensitive) meets many of the specific requirements of software quality modeling. The algorithms were evaluated using C4.5 decision trees and decision stumps as base learners in two large-scale case studies of industrial software systems.
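For readers unfamiliar with the technique, the sketch below illustrates one common cost-sensitive variant of AdaBoost with decision stumps as base learners, in the spirit of the abstract. It is a minimal illustration only: the cost ratio, the CSB-style weight update, the function names, and the use of scikit-learn stumps are assumptions for exposition, not the paper's exact Cost-Boosting procedure.

# Minimal sketch of cost-sensitive boosting with decision stumps.
# Illustrative CSB-style variant, NOT the paper's exact Cost-Boosting
# update; cost values and helper names here are hypothetical.
import numpy as np
from sklearn.tree import DecisionTreeClassifier  # max_depth=1 -> decision stump

def cost_boost_fit(X, y, n_rounds=50, cost_fn=5.0, cost_fp=1.0):
    """y in {-1, +1}; +1 = fault-prone. cost_fn > cost_fp makes missing a
    fault-prone module costlier than a false alarm (assumed cost ratio)."""
    costs = np.where(y == 1, cost_fn, cost_fp)  # per-instance misclassification cost
    w = costs / costs.sum()                     # cost-proportional initial weights
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = w[pred != y].sum()                # weighted training error
        if err >= 0.5:                          # no better than chance: stop
            break
        alpha = 0.5 * np.log((1.0 - err) / max(err, 1e-12))
        stumps.append(stump)
        alphas.append(alpha)
        w = w * np.exp(-alpha * y * pred)       # standard AdaBoost reweighting
        w[pred != y] *= costs[pred != y]        # cost step: costly mistakes grow faster
        w /= w.sum()                            # renormalize to a distribution
    return stumps, alphas

def cost_boost_predict(stumps, alphas, X):
    # Weighted majority vote over the learned stumps.
    agg = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
    return np.sign(agg)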
Pages: 51-60
Page count: 10