Estimating the agreement and diagnostic accuracy of two diagnostic tests when one test is conducted on only a subsample of specimens

Cited by: 21
Authors
Katki, Hormuzd A. [1 ]
Li, Yan [2 ]
Edelstein, David W. [3 ]
Castle, Philip E. [4 ]
Affiliations
[1] NCI, Div Canc Epidemiol & Genet, Rockville, MD USA
[2] Univ Texas Arlington, Dept Math, Arlington, TX 76019 USA
[3] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[4] Amer Soc Clin Pathologists, Washington, DC USA
Keywords
verification bias; symmetry test; kappa; two-phase design; HPV; sensitivity; specificity; gold standard; DOUBLE SAMPLING SCHEME; DISEASE VERIFICATION; GOLD STANDARD; BINOMIAL DATA; SENSITIVITY; SPECIFICITY; DESIGNS; 2-STAGE; ERROR; BIAS;
DOI
10.1002/sim.4422
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
We focus on the efficient usage of specimen repositories for the evaluation of new diagnostic tests and for comparing new tests with existing tests. Typically, all pre-existing diagnostic tests will already have been conducted on all specimens. However, we propose retesting only a judicious subsample of the specimens by the new diagnostic test. Subsampling minimizes study costs and specimen consumption, yet estimates of agreement or diagnostic accuracy potentially retain adequate statistical efficiency. We introduce methods to estimate agreement statistics and conduct symmetry tests when the second test is conducted on only a subsample and no gold standard exists. The methods treat the subsample as a stratified two-phase sample and use inverse-probability weighting. Strata can be any information available on all specimens and can be used to oversample the most informative specimens. The verification bias framework applies if the test conducted on only the subsample is a gold standard. We also present inverse-probability-weighting-based estimators of diagnostic accuracy that take advantage of stratification. We present three examples demonstrating that adequate statistical efficiency can be achieved under subsampling while greatly reducing the number of specimens requiring retesting. Naively using standard estimators that ignore subsampling can lead to drastically misleading estimates. Through simulation, we assess the finite-sample properties of our estimators and consider other possible sampling designs for our examples that could have further improved statistical efficiency. To help promote subsampling designs, our R package CompareTests computes all of our agreement and diagnostic accuracy statistics. Copyright (c) 2011 John Wiley & Sons, Ltd.
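The inverse-probability-weighting idea described in the abstract can be illustrated with a minimal sketch. It assumes simple random sampling within each stratum, so that each retested specimen receives the weight N_h / n_h (stratum size divided by the number retested in that stratum), and it computes only a point estimate of Cohen's kappa with no variance estimate. The function names and the toy counts are hypothetical. This is not the CompareTests implementation.

```python
from collections import defaultdict


def ipw_agreement_table(records, stratum_sizes):
    """Build an inverse-probability-weighted cross-table of two tests.

    records: (stratum, test1_result, test2_result) for retested specimens only.
    stratum_sizes: stratum -> number of specimens in the full repository.
    Assumes simple random sampling within each stratum (an assumption of
    this sketch), so the weight is N_h / n_h.
    """
    sampled = defaultdict(int)
    for stratum, _, _ in records:
        sampled[stratum] += 1

    table = defaultdict(float)
    for stratum, t1, t2 in records:
        weight = stratum_sizes[stratum] / sampled[stratum]
        table[(t1, t2)] += weight
    return dict(table)


def kappa_from_table(table):
    """Cohen's kappa computed from (possibly weighted) cell counts."""
    total = sum(table.values())
    categories = {c for pair in table for c in pair}
    observed = sum(table.get((c, c), 0.0) for c in categories) / total
    expected = 0.0
    for c in categories:
        row = sum(v for (a, _), v in table.items() if a == c) / total
        col = sum(v for (_, b), v in table.items() if b == c) / total
        expected += row * col
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy data: strata are defined by the result of test 1,
# and test-1 positives are oversampled (50 of 100 versus 90 of 900).
stratum_sizes = {"pos": 100, "neg": 900}
records = (
    [("pos", 1, 1)] * 40 + [("pos", 1, 0)] * 10
    + [("neg", 0, 1)] * 9 + [("neg", 0, 0)] * 81
)

table = ipw_agreement_table(records, stratum_sizes)
kappa = kappa_from_table(table)

# Naive estimate that ignores the subsampling, for comparison.
naive_table = defaultdict(float)
for _, t1, t2 in records:
    naive_table[(t1, t2)] += 1.0
naive_kappa = kappa_from_table(dict(naive_table))
```

In this toy example the weighted and naive kappa values differ noticeably, which mirrors the abstract's warning that standard estimators ignoring subsampling can mislead.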
Pages: 436-448
Number of pages: 13
Related papers
49 items in total
  • [11] EM algorithm for comparing two binary diagnostic tests when not all the patients are verified
    Roldan Nofuentes, J. A.
    Luna del Castillo, J. D.
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2008, 78 (01) : 19 - 35
  • [12] Estimating diagnostic test accuracy for infectious salmon anaemia virus in Maine, USA
    Gustafson, L.
    Ellis, S.
    Bouchard, D.
    Robinson, T.
    Marenghi, F.
    Warg, J.
    Giray, C.
    JOURNAL OF FISH DISEASES, 2008, 31 (02) : 117 - 125
  • [13] Diagnostic Accuracy of a Qualitative Fecal Immunochemical Test Varies With Location of Neoplasia But Not Number of Specimens
    Wong, Martin C. S.
    Ching, Jessica Y. L.
    Chan, Victor C. W.
    Lam, Thomas Y. T.
    Shum, Jeffrey P.
    Luk, Arthur K. C.
    Wong, Sunny S. H.
    Ng, Siew C.
    Ng, Simon S. M.
    Wu, Justin C. Y.
    Chan, Francis K. L.
    Sung, Joseph J. Y.
    CLINICAL GASTROENTEROLOGY AND HEPATOLOGY, 2015, 13 (08) : 1472 - 1479
  • [14] On simultaneous assessment of sensitivity and specificity when combining two diagnostic tests
    Tang, ML
    STATISTICS IN MEDICINE, 2004, 23 (23) : 3593 - 3605
  • [15] On implementation of the Gibbs sampler for estimating the accuracy of multiple diagnostic tests
    Principato, Fabio
    Vullo, Angela
    Matranga, Domenica
    JOURNAL OF APPLIED STATISTICS, 2010, 37 (08) : 1335 - 1354
  • [16] Comparison of three rapid influenza diagnostic tests with digital readout systems and one conventional rapid influenza diagnostic test
    Ryu, Sook Won
    Suh, In Bum
    Ryu, Se-Min
    Shin, Kyu Sung
    Kim, Hyon-Suk
    Kim, Juwon
    Uh, Young
    Yoon, Kap Jun
    Lee, Jong-Han
    JOURNAL OF CLINICAL LABORATORY ANALYSIS, 2018, 32 (02)
  • [17] Accuracy of screening tests for gestational diabetes mellitus in Southeast Asia: A systematic review of diagnostic test accuracy studies
    Lappharat, Sattamat
    Liabsuetrakul, Tippawan
    MEDICINE, 2020, 99 (46) : E23161
  • [18] Recommended methods to compare the accuracy of two binary diagnostic tests subject to a paired design
    Roldan-Nofuentes, J. A.
    Sidaty-Regad, S. B.
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2019, 89 (14) : 2621 - 2644
  • [19] Using a Web-Based Application to Define the Accuracy of Diagnostic Tests When the Gold Standard Is Imperfect
    Lim, Cherry
    Wannapinij, Prapass
    White, Lisa
    Day, Nicholas P. J.
    Cooper, Ben S.
    Peacock, Sharon J.
    Limmathurotsakul, Direk
    PLOS ONE, 2013, 8 (11):
  • [20] Adjusting for verification bias in diagnostic accuracy measures when comparing multiple screening tests - an application to the IP1-PROSTAGRAM study
    Emily Day
    David Eldred-Evans
    A. Toby Prevost
    Hashim U. Ahmed
    Francesca Fiorentino
    BMC Medical Research Methodology, 22