Assessing the reliability of statistical software: Part I

Citations: 80
Author
McCullough, BD [1]
Affiliation
[1] FCC, Washington, DC 20554 USA
Keywords
accuracy; benchmarks; random number generator; software testing; StRD
DOI
10.2307/2685442
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
Entry-level tests of the accuracy of statistical software, such as Wilkinson's Statistics Quiz, have long been available, but more advanced collections of tests have not. This article proposes a set of intermediate-level tests focusing on three areas: estimation, both linear and nonlinear; random number generation; and statistical distributions (e.g., for calculating p-values). The complete methodology is described in detail. Convenient methods for summarizing the results are presented, so that an assessment of numerical accuracy can easily be incorporated into a software review.
Pages: 358-366
Page count: 9