User preferences based software defect detection algorithms selection using MCDM

Cited by: 71
Authors
Peng, Yi [1]
Wang, Guoxun [1]
Wang, Honggang [2]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Management & Econ, Chengdu 610054, Peoples R China
[2] Univ Massachusetts, Dept Elect & Comp Engn, Dartmouth, MA USA
Funding
National Natural Science Foundation of China;
Keywords
Algorithm selection; Classification algorithm; Knowledge-driven data mining; Multi-criteria decision making (MCDM); Software defect detection; HIERARCHY PROCESS; CLASSIFIERS; PREDICTION; FRAMEWORK; MODELS; TOPSIS; DEA; TREES;
DOI
10.1016/j.ins.2010.04.019
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
A variety of classification algorithms for software defect detection have been developed over the years. How to select an appropriate classifier for a given task is an important issue in data mining and knowledge discovery (DMKD). Many studies have compared different types of classification algorithms, and the performance of these algorithms may vary with the performance measure used and the circumstances of the comparison. Since the algorithm selection task needs to examine several criteria, such as accuracy, computational time, and misclassification rate, it can be modeled as a multiple criteria decision making (MCDM) problem. The goal of this paper is to use a set of MCDM methods to rank classification algorithms, with empirical results based on software defect detection datasets. Since the preferences of the decision maker (DM) play an important role in algorithm evaluation and selection, this paper involves the DM in the ranking procedure by assigning user weights to the performance measures. Four MCDM methods are examined using 38 classification algorithms and 13 evaluation criteria over 10 public-domain software defect datasets. The results indicate that the boosting of CART and the boosting of the C4.5 decision tree are ranked as the most appropriate algorithms for software defect datasets. Though the MCDM methods provide some conflicting results for the selected software defect datasets, they agree on most top-ranked classification algorithms. © 2010 Elsevier Inc. All rights reserved.
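As an illustration of how user-weighted criteria can be turned into a ranking, the following is a minimal sketch of TOPSIS, one MCDM method named in the keywords. It is not the authors' implementation: the function name `topsis`, the classifier labels, the criterion values, and the weights are all hypothetical, and the record does not state which four MCDM methods or which 13 criteria the paper uses.

```python
import math


def topsis(scores, weights, benefit):
    """Rank alternatives by TOPSIS closeness to the ideal solution.

    scores  : dict mapping alternative name -> list of criterion values
    weights : user preference weights, one per criterion (rescaled to sum to 1)
    benefit : list of bools, True if larger is better, False if smaller is better
    Assumes every criterion has at least one non-zero value and that the
    alternatives are not all identical (otherwise a division by zero occurs).
    """
    names = list(scores)
    n_crit = len(weights)
    total = sum(weights)
    w = [x / total for x in weights]

    # Vector-normalize each criterion column, then apply the user weights.
    norms = [math.sqrt(sum(scores[a][j] ** 2 for a in names)) for j in range(n_crit)]
    v = {a: [w[j] * scores[a][j] / norms[j] for j in range(n_crit)] for a in names}

    # Ideal and anti-ideal points depend on whether a criterion is benefit or cost.
    columns = [[v[a][j] for a in names] for j in range(n_crit)]
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]

    closeness = {}
    for a in names:
        d_pos = math.sqrt(sum((v[a][j] - ideal[j]) ** 2 for j in range(n_crit)))
        d_neg = math.sqrt(sum((v[a][j] - anti[j]) ** 2 for j in range(n_crit)))
        closeness[a] = d_neg / (d_pos + d_neg)

    # Highest closeness first.
    return sorted(closeness.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical values: [accuracy, computational time (s), misclassification rate]
example_scores = {
    "classifier_A": [0.90, 12.0, 0.10],
    "classifier_B": [0.85, 2.0, 0.15],
    "classifier_C": [0.70, 1.0, 0.30],
}
example_weights = [0.6, 0.1, 0.3]      # hypothetical user preferences
example_benefit = [True, False, False]  # accuracy is a benefit; the rest are costs

ranking = topsis(example_scores, example_weights, example_benefit)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Changing `example_weights` changes the ranking, which is the sense in which the decision maker's preferences enter the selection.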
Pages: 3-13
Page count: 11