Software effort models should be assessed via leave-one-out validation

Cited by: 109
Authors
Kocaguneli, Ekrem [1 ]
Menzies, Tim [1 ]
Affiliations
[1] W Virginia Univ, CSEE, Morgantown, WV 26506 USA
Keywords
Software cost estimation; Prediction system; Bias; Variance; Selection; Systems;
DOI
10.1016/j.jss.2013.02.053
CLC classification
TP31 [Computer Software];
Discipline codes
081202 ; 0835 ;
Abstract
Context: More than half the literature on software effort estimation (SEE) focuses on model comparisons. Each of those requires a sampling method (SM) to generate the train and test sets. Different authors use different SMs, such as leave-one-out (LOO), 3Way and 10Way cross-validation. While LOO is a deterministic algorithm, the N-way methods use random selection to build their train and test sets. This introduces the problem of conclusion instability, where different authors rank effort estimators in different ways. Objective: To reduce conclusion instability by removing the effects of a sampling method's random test case generation. Method: Calculate bias and variance (B&V) values following the assumption that a learner trained on the whole dataset is taken as the true model; then demonstrate that the B&V and runtime values for LOO are similar to N-way by running 90 different algorithms on 20 different SEE datasets. For each algorithm, collect runtimes and B&V values under LOO, 3Way and 10Way. Results: We observed that: (1) the majority of the algorithms have statistically indistinguishable B&V values under different SMs and (2) different SMs have similar runtimes. Conclusion: In terms of their generated B&V values and runtimes, there is no reason to prefer N-way over LOO. In terms of reproducibility, LOO removes one cause of conclusion instability (the random selection of train and test sets). Therefore, we deprecate N-way and endorse LOO validation for assessing effort models. (C) 2013 Elsevier Inc. All rights reserved.
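The abstract's central contrast — LOO is deterministic while N-way cross-validation depends on a random shuffle — can be made concrete with a minimal pure-Python sketch. This is not the authors' code; the function names and fold construction are illustrative assumptions.

```python
import random

def loo_splits(n):
    """Leave-one-out: deterministic; each example is the test set exactly once."""
    for i in range(n):
        train = [j for j in range(n) if j != i]
        yield train, [i]

def kfold_splits(n, k, seed=None):
    """N-way cross-validation: a random shuffle decides fold membership,
    so different runs (seeds) produce different train/test sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal folds
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]

# LOO yields identical splits on every run (no randomness to report or fix);
# reproducing a 3Way or 10Way result requires knowing the seed that was used.
loo_a = list(loo_splits(6))
loo_b = list(loo_splits(6))
three_way = list(kfold_splits(6, k=3, seed=42))
```

Under this framing, "conclusion instability" from random test-case generation disappears for LOO by construction: any two researchers running LOO on the same dataset evaluate exactly the same train/test pairs.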
Pages: 1879-1890
Page count: 12