Bayesian posterior predictive p-value of statistical consistency in interlaboratory evaluations

Cited by: 21
Authors
Kacker, Raghu N. [1]
Forbes, Alistair [2]
Kessel, Ruediger [1]
Sommer, Klaus-Dieter [3]
Affiliations
[1] Natl Inst Stand & Technol, Gaithersburg, MD 20899 USA
[2] Natl Phys Lab, Teddington TW11 0LW, Middx, England
[3] Phys Tech Bundesanstalt, D-38116 Braunschweig, Germany
Keywords
Statistical tests;
DOI
10.1088/0026-1394/45/5/004
Chinese Library Classification number
TH7 [Instruments and meters];
Discipline classification codes
0804 ; 080401 ; 081102 ;
Abstract
The results from an interlaboratory evaluation are said to be statistically consistent if they fit a normal (Gaussian) consistency model which postulates that the results have the same unknown expected value and stated variances-covariances. A modern method for checking the fit of a statistical model to the data is posterior predictive checking, which is a Bayesian adaptation of classical hypothesis testing. In this paper we propose the use of posterior predictive checking to check the fit of the normal consistency model to interlaboratory results. If the model fits reasonably then the results may be regarded as statistically consistent. The principle of posterior predictive checking is that the realized results should look plausible under a posterior predictive distribution. A posterior predictive distribution is the conditional distribution of potential results, given the realized results, which could be obtained in contemplated replications of the interlaboratory evaluation under the statistical model. A systematic discrepancy between potential results obtained from the posterior predictive distribution and the realized results indicates a potential failing of the model. One can investigate any number of potential discrepancies between the model and the results. We discuss an overall measure of discrepancy for checking the consistency of a set of interlaboratory results. We also discuss two sets of unilateral and bilateral measures of discrepancy. A unilateral discrepancy measure checks whether the result of a particular laboratory agrees with the statistical consistency model. A bilateral discrepancy measure checks whether the results of a particular pair of laboratories agree with each other. The degree of agreement is quantified by the Bayesian posterior predictive p-value. The unilateral and bilateral measures of discrepancy and their posterior predictive p-values discussed in this paper apply to both correlated and independent interlaboratory results. 
We suggest that the posterior predictive p-values may be used to assess unilateral and bilateral degrees of agreement in International Committee for Weights and Measures (CIPM) key comparisons.
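The checking principle described in the abstract can be illustrated with a small Monte Carlo sketch. It is not the authors' procedure: it assumes independent results, a flat prior on the common expected value, stated standard uncertainties treated as known, and an overall discrepancy taken as the sum of squared standardized deviations from the weighted mean. The function names (`weighted_mean`, `discrepancy`, `posterior_predictive_p`) and the sample data are hypothetical, and the correlated case and the unilateral and bilateral measures are not covered.

```python
import random


def weighted_mean(x, u):
    """Inverse-variance weighted mean of results x and its standard uncertainty."""
    w = [1.0 / ui ** 2 for ui in u]
    sw = sum(w)
    mean = sum(wi * xi for wi, xi in zip(w, x)) / sw
    return mean, sw ** -0.5


def discrepancy(x, u):
    """Assumed overall discrepancy: squared standardized deviations
    of the results from their weighted mean, summed over laboratories."""
    m, _ = weighted_mean(x, u)
    return sum(((xi - m) / ui) ** 2 for xi, ui in zip(x, u))


def posterior_predictive_p(x, u, draws=20000, seed=0):
    """Monte Carlo estimate of P(T(x_rep) >= T(x) | x) under the assumed model."""
    rng = random.Random(seed)
    m, um = weighted_mean(x, u)
    t_obs = discrepancy(x, u)
    exceed = 0
    for _ in range(draws):
        # Posterior draw of the common expected value (flat prior assumed).
        mu = rng.gauss(m, um)
        # Potential results from a contemplated replication of the evaluation.
        x_rep = [rng.gauss(mu, ui) for ui in u]
        if discrepancy(x_rep, u) >= t_obs:
            exceed += 1
    return exceed / draws


if __name__ == "__main__":
    u = [0.1, 0.1, 0.1, 0.1]               # hypothetical stated uncertainties
    agreeing = [10.0, 10.1, 9.9, 10.05]    # hypothetical results
    with_outlier = [10.0, 10.1, 9.9, 11.0]
    print(posterior_predictive_p(agreeing, u))
    print(posterior_predictive_p(with_outlier, u))
```

A p-value near zero means that replicated results rarely look as discrepant as the realized ones, which points to a failing of the consistency model. Under these particular assumptions the replicated discrepancy does not depend on the drawn expected value, so the estimate should approach the upper tail probability of a chi-squared distribution with n - 1 degrees of freedom.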
Pages: 512-523
Number of pages: 12