A systematic review of effect size in software engineering experiments

Cited by: 226
Authors
Kampenes, Vigdis By
Dyba, Tore
Hannay, Jo E.
Sjoberg, Dag I. K.
Affiliations
[1] Dept Software Engn, Simula Res Lab, NO-1325 Lysaker, Norway
[2] Univ Oslo, Dept Informat, NO-0316 Oslo, Norway
[3] SINTEF ICT, NO-7465 Trondheim, Norway
Keywords
empirical software engineering; controlled experiments; effect size; statistical significance; practical importance
DOI
10.1016/j.infsof.2007.02.015
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
An effect size quantifies the effects of an experimental treatment. Conclusions drawn from hypothesis testing results might be erroneous if effect sizes are not judged in addition to statistical significance. This paper reports a systematic review of 92 controlled experiments published in 12 major software engineering journals and conference proceedings in the decade 1993-2002. The review investigates the practice of effect size reporting, summarizes standardized effect sizes detected in the experiments, discusses the results and gives advice for improvements. Standardized and/or unstandardized effect sizes were reported in 29% of the experiments. Interpretations of the effect sizes in terms of practical importance were not discussed beyond references to standard conventions. The standardized effect sizes computed from the reviewed experiments were equal to observations in psychology studies and slightly larger than standard conventions in behavioral science. (C) 2007 Elsevier B.V. All rights reserved.
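As an illustration of what a standardized effect size is, the sketch below computes a standardized mean difference (Cohen's d with a pooled standard deviation) and Hedges' small-sample correction for two independent groups. The abstract does not say which standardized measure the authors computed, so the choice of measure, the function names, and the sample data are all assumptions made for illustration only.

```python
import math
from statistics import mean, variance


def cohens_d(treatment, control):
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(treatment), len(control)
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / math.sqrt(pooled_var)


def hedges_g(treatment, control):
    """Cohen's d multiplied by the usual approximate small-sample bias correction."""
    n1, n2 = len(treatment), len(control)
    # Approximation J = 1 - 3 / (4 * df - 1), with df = n1 + n2 - 2.
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return cohens_d(treatment, control) * correction


# Hypothetical scores, not data from the reviewed experiments.
treatment_scores = [2, 4, 6]
control_scores = [1, 3, 5]
print(cohens_d(treatment_scores, control_scores))
print(hedges_g(treatment_scores, control_scores))
```

An unstandardized effect size would simply be the raw difference in means, in the original measurement units. Dividing by the pooled standard deviation makes the value comparable across experiments that use different scales.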
Pages: 1073-1086
Page count: 14