User preferences based software defect detection algorithms selection using MCDM

Cited by: 71
Authors
Peng, Yi [1]
Wang, Guoxun [1]
Wang, Honggang [2]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Management & Econ, Chengdu 610054, Peoples R China
[2] Univ Massachusetts, Dept Elect & Comp Engn, Dartmouth, MA USA
Funding
National Natural Science Foundation of China
Keywords
Algorithm selection; Classification algorithm; Knowledge-driven data mining; Multi-criteria decision making (MCDM); Software defect detection; HIERARCHY PROCESS; CLASSIFIERS; PREDICTION; FRAMEWORK; MODELS; TOPSIS; DEA; TREES;
DOI
10.1016/j.ins.2010.04.019
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A variety of classification algorithms for software defect detection have been developed over the years. How to select an appropriate classifier for a given task is an important issue in data mining and knowledge discovery (DMKD). Many studies have compared different types of classification algorithms, and the performance of these algorithms may vary with the performance measure used and the circumstances of the comparison. Since the algorithm selection task needs to examine several criteria, such as accuracy, computational time, and misclassification rate, it can be modeled as a multiple criteria decision making (MCDM) problem. The goal of this paper is to use a set of MCDM methods to rank classification algorithms, with empirical results based on software defect detection datasets. Since the preferences of the decision maker (DM) play an important role in algorithm evaluation and selection, this paper involves the DM in the ranking procedure by assigning user weights to the performance measures. Four MCDM methods are examined using 38 classification algorithms and 13 evaluation criteria over 10 public-domain software defect datasets. The results indicate that the boosting of CART and the boosting of the C4.5 decision tree are ranked as the most appropriate algorithms for software defect datasets. Though the MCDM methods provide some conflicting results for the selected software defect datasets, they agree on most top-ranked classification algorithms. © 2010 Elsevier Inc. All rights reserved.
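The idea of ranking classifiers by user-weighted performance measures can be illustrated with a minimal sketch. It uses TOPSIS, which appears among this record's keywords; the abstract does not name the four MCDM methods examined, so this is not necessarily the authors' procedure. The classifier names, scores, weights, and the `topsis` helper are all hypothetical and chosen only for illustration.

```python
import math

def topsis(scores, weights, benefit):
    """Return a TOPSIS closeness score in [0, 1] for each alternative.

    scores  : rows are alternatives (classifiers), columns are criteria
    weights : user-assigned importance of each criterion (any positive scale)
    benefit : True if larger is better for that criterion, False if smaller is
    """
    n_crit = len(weights)
    total = sum(weights)
    w = [x / total for x in weights]  # normalise user weights to sum to 1

    # Vector-normalise each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in scores)) or 1.0
             for j in range(n_crit)]
    v = [[w[j] * row[j] / norms[j] for j in range(n_crit)] for row in scores]

    # Ideal and anti-ideal points, respecting the direction of each criterion.
    cols = list(zip(*v))
    ideal = [max(c) if b else min(c) for c, b in zip(cols, benefit)]
    anti = [min(c) if b else max(c) for c, b in zip(cols, benefit)]

    closeness = []
    for row in v:
        d_pos = math.dist(row, ideal)  # distance to the ideal point
        d_neg = math.dist(row, anti)   # distance to the anti-ideal point
        denom = d_pos + d_neg
        closeness.append(d_neg / denom if denom > 0 else 0.5)
    return closeness

# Hypothetical data: accuracy, training time (s), misclassification rate.
names = ["classifier_A", "classifier_B", "classifier_C"]
scores = [
    [0.90, 1.0, 0.10],
    [0.80, 2.0, 0.20],
    [0.70, 3.0, 0.30],
]
weights = [0.5, 0.2, 0.3]        # assumed user preferences
benefit = [True, False, False]   # accuracy up; time and error rate down

closeness = topsis(scores, weights, benefit)
ranking = sorted(names, key=lambda n: closeness[names.index(n)], reverse=True)
print(ranking)
```

Changing `weights` is how a decision maker's preferences would enter this sketch: raising the weight on training time, for example, shifts the ranking toward faster classifiers when the alternatives trade off against one another.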
Pages: 3-13
Page count: 11