Software effort models should be assessed via leave-one-out validation

Cited by: 105
Authors
Kocaguneli, Ekrem [1]
Menzies, Tim [1]
Affiliations
[1] W Virginia Univ, CSEE, Morgantown, WV 26506 USA
Keywords
Software cost estimation; Prediction system; Bias; Variance; COST ESTIMATION; PREDICTION; SELECTION; SYSTEMS;
DOI
10.1016/j.jss.2013.02.053
CLC number
TP31 [Computer Software];
Subject classification codes
081202 ; 0835 ;
Abstract
Context: More than half the literature on software effort estimation (SEE) focuses on model comparisons. Each such comparison requires a sampling method (SM) to generate the train and test sets. Different authors use different SMs, such as leave-one-out (LOO), 3Way and 10Way cross-validation. While LOO is a deterministic algorithm, the N-way methods use random selection to build their train and test sets. This introduces the problem of conclusion instability, where different authors rank effort estimators in different ways. Objective: To reduce conclusion instability by removing the effects of a sampling method's random test case generation. Method: Calculate bias and variance (B&V) values under the assumption that a learner trained on the whole dataset is the true model; then demonstrate that the B&V and runtime values for LOO are similar to those for N-way by running 90 different algorithms on 20 different SEE datasets. For each algorithm, collect runtimes and B&V values under LOO, 3Way and 10Way. Results: We observed that: (1) the majority of the algorithms have statistically indistinguishable B&V values under different SMs, and (2) different SMs have similar runtimes. Conclusion: In terms of their generated B&V values and runtimes, there is no reason to prefer N-way over LOO. In terms of reproducibility, LOO removes one cause of conclusion instability (the random selection of train and test sets). Therefore, we deprecate N-way and endorse LOO validation for assessing effort models. (C) 2013 Elsevier Inc. All rights reserved.
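The contrast between deterministic LOO and randomized N-way sampling can be sketched in a few lines of Python. This is an illustrative assumption-laden toy, not the authors' experimental setup: the dataset, the trivial "mean effort" estimator, and the exact B&V definitions are all invented here; the paper's "true model" assumption (a learner trained on the whole dataset) is the only element taken from the abstract.

```python
# Toy comparison of leave-one-out (LOO) vs. randomized 3-way sampling for a
# software-effort estimator. All names and numbers below are hypothetical.
import random
import statistics

efforts = [12.0, 8.5, 30.0, 22.5, 15.0, 9.0, 41.0, 18.0, 25.5, 11.0]

def mean_predictor(train):
    """Deliberately simple estimator: always predict the mean training effort."""
    m = statistics.mean(train)
    return lambda _x: m

# Paper's assumption: the learner trained on the whole dataset is the true model.
true_predict = mean_predictor(efforts)

def bias_variance(folds):
    """Bias = mean |prediction - true-model prediction| over all test cases;
    variance = population variance of the test predictions across folds."""
    preds, biases = [], []
    for test_idx in folds:
        train = [e for i, e in enumerate(efforts) if i not in test_idx]
        predict = mean_predictor(train)
        for i in test_idx:
            p = predict(efforts[i])
            preds.append(p)
            biases.append(abs(p - true_predict(efforts[i])))
    return statistics.mean(biases), statistics.pvariance(preds)

# LOO is deterministic: each instance is its own test fold.
loo_folds = [{i} for i in range(len(efforts))]

# 3Way needs a random shuffle before splitting into folds -- this is the
# source of randomness that the paper blames for conclusion instability.
idx = list(range(len(efforts)))
random.Random(1).shuffle(idx)
kway_folds = [set(idx[i::3]) for i in range(3)]

print("LOO  bias/variance:", bias_variance(loo_folds))
print("3Way bias/variance:", bias_variance(kway_folds))
```

Rerunning the 3Way split with a different seed changes its folds (and hence its B&V numbers), while the LOO result is identical on every run, which is the reproducibility argument the abstract makes.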
Pages: 1879-1890 (12 pages)