Evaluating prediction systems in software project estimation

Cited by: 259
Authors
Shepperd, Martin [1]
MacDonell, Steve [2]
Affiliations
[1] Brunel Univ, Dept IS & Comp, Uxbridge UB83PH, Middx, England
[2] Auckland Univ Technol, Dept Comp & Math Sci, Auckland 1142, New Zealand
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Software engineering; Prediction system; Empirical validation; Randomisation techniques; REGRESSION; ACCURACY; ANALOGY;
DOI
10.1016/j.infsof.2011.12.008
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Context: Software engineering has a problem in that when we empirically evaluate competing prediction systems we obtain conflicting results.
Objective: To reduce the inconsistency amongst validation study results and provide a more formal foundation to interpret results with a particular focus on continuous prediction systems.
Method: A new framework is proposed for evaluating competing prediction systems based upon (1) an unbiased statistic, Standardised Accuracy, (2) testing the result likelihood relative to the baseline technique of random 'predictions', that is guessing, and (3) calculation of effect sizes.
Results: Previously published empirical evaluations of prediction systems are re-examined and the original conclusions shown to be unsafe. Additionally, even the strongest results are shown to have no more than a medium effect size relative to random guessing.
Conclusions: Biased accuracy statistics such as MMRE are deprecated. By contrast this new empirical validation framework leads to meaningful results. Such steps will assist in performing future meta-analyses and in providing more robust and usable recommendations to practitioners.
© 2012 Elsevier B.V. All rights reserved.
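The abstract names the three components of the framework but gives no formulas. The following is a minimal Python sketch of how they might fit together, under stated assumptions rather than the authors' exact procedure: Standardised Accuracy is taken as 1 - MAR / MAR_P0, where MAR is the mean absolute residual of the system under test and MAR_P0 is the mean MAR over many runs of random guessing; a random guess for one case is the actual value of a different, randomly chosen case; and the effect size is a standardised difference using the spread of the guessing runs. The function names, the run count, and the sign convention (positive means better than guessing) are all illustrative choices.

```python
import random
import statistics


def mean_absolute_residual(actual, predicted):
    """Mean of |actual - predicted| over all cases (MAR)."""
    return statistics.fmean(abs(a - p) for a, p in zip(actual, predicted))


def random_guess_mars(actual, runs=1000, seed=0):
    """MAR of each random-guessing run.

    Assumption: a 'guess' for case t is the actual value of some other
    case r != t, drawn uniformly at random. Requires at least 2 cases.
    """
    rng = random.Random(seed)
    n = len(actual)
    mars = []
    for _ in range(runs):
        guesses = []
        for t in range(n):
            r = rng.randrange(n - 1)
            if r >= t:  # skip over t so that r != t
                r += 1
            guesses.append(actual[r])
        mars.append(mean_absolute_residual(actual, guesses))
    return mars


def evaluate(actual, predicted, runs=1000, seed=0):
    """Return (standardised accuracy, effect size, fraction of guessing
    runs that did at least as well as the system under test)."""
    baseline = random_guess_mars(actual, runs, seed)
    mar = mean_absolute_residual(actual, predicted)
    mar_p0 = statistics.fmean(baseline)
    sa = 1.0 - mar / mar_p0
    # Assumed sign convention: positive means better than guessing.
    effect = (mar_p0 - mar) / statistics.stdev(baseline)
    as_good = sum(1 for m in baseline if m <= mar) / len(baseline)
    return sa, effect, as_good
```

Read this way, a Standardised Accuracy near 0 means the system does about as well as guessing, a negative value means it does worse, and the third return value gives a rough sense of how likely the observed accuracy would be under guessing alone.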
Pages: 820-827
Number of pages: 8