Developing biomarker combinations in multicenter studies via direct maximization and penalization

Cited by: 1
Authors
Meisner, Allison [1]
Parikh, Chirag R. [2]
Kerr, Kathleen F. [3]
Affiliations
[1] Johns Hopkins Univ, Dept Biostat, Baltimore, MD 21205 USA
[2] Johns Hopkins Univ, Div Nephrol, Baltimore, MD USA
[3] Univ Washington, Dept Biostat, Seattle, WA 98195 USA
Keywords
adjusted AUC; biomarker combinations; multicenter; penalization; CLASSIFICATION; ACCURACY; MARKERS; CURVE; AREA
DOI
10.1002/sim.8673
Chinese Library Classification (CLC) number
Q [Biological sciences]
Subject classification codes
07; 0710; 09
Abstract
Motivated by a study of acute kidney injury, we consider the setting of biomarker studies involving patients at multiple centers where the goal is to develop a biomarker combination for diagnosis, prognosis, or screening. As biomarker studies become larger, this type of data structure will be encountered more frequently. In the presence of multiple centers, one way to assess the predictive capacity of a given combination is to consider the center-adjusted area under the receiver operating characteristic curve (aAUC), a summary of the ability of the combination to discriminate between cases and controls in each center. Rather than using a general method, such as logistic regression, to construct the biomarker combination, we propose directly maximizing the aAUC. Furthermore, it may be desirable to have a biomarker combination with similar performance across centers. To that end, we allow for penalization of the variability in the center-specific AUCs. We demonstrate desirable asymptotic properties of the resulting combinations. Simulations provide small-sample evidence that maximizing the aAUC can lead to combinations with improved performance. We also use simulated data to illustrate the utility of constructing combinations by maximizing the aAUC while penalizing variability. Finally, we apply these methods to data from the study of acute kidney injury.
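The objective described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact estimator, so everything specific here is an assumption: the aAUC is taken as an average of empirical center-specific AUCs weighted by each center's share of cases, the penalty is the weighted variance of those AUCs, and the names `empirical_auc`, `center_aucs`, `penalized_aauc`, and `lam` are hypothetical. It only evaluates the criterion for a given combination score; it is not the authors' optimization procedure.

```python
import numpy as np

def empirical_auc(case_scores, control_scores):
    """Mann-Whitney estimate of P(case score > control score), ties count 1/2."""
    cases = np.asarray(case_scores, dtype=float)[:, None]
    controls = np.asarray(control_scores, dtype=float)[None, :]
    return float(np.mean((cases > controls) + 0.5 * (cases == controls)))

def center_aucs(scores, labels, centers):
    """Return the AUC and number of cases for each center with both classes."""
    scores, labels, centers = map(np.asarray, (scores, labels, centers))
    aucs, n_cases = [], []
    for c in np.unique(centers):
        in_c = centers == c
        case = scores[in_c & (labels == 1)]
        ctrl = scores[in_c & (labels == 0)]
        if len(case) == 0 or len(ctrl) == 0:
            continue  # AUC is undefined without both cases and controls
        aucs.append(empirical_auc(case, ctrl))
        n_cases.append(len(case))
    return np.array(aucs), np.array(n_cases)

def penalized_aauc(scores, labels, centers, lam=0.0):
    """Assumed criterion: case-weighted mean AUC minus lam * weighted variance."""
    aucs, n_cases = center_aucs(scores, labels, centers)
    weights = n_cases / n_cases.sum()
    aauc = float(np.sum(weights * aucs))
    penalty = float(np.sum(weights * (aucs - aauc) ** 2))
    return aauc - lam * penalty

# Toy data: two centers, scores from some linear biomarker combination.
scores  = [2, 3, 1, 1, 1, 3, 2, 2]
labels  = [1, 1, 0, 0, 1, 1, 0, 0]
centers = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(penalized_aauc(scores, labels, centers, lam=0.0))  # unpenalized aAUC
print(penalized_aauc(scores, labels, centers, lam=1.0))  # variability penalized
```

With `lam=0` the function returns the assumed aAUC itself; larger values of `lam` favor combinations whose center-specific AUCs are more alike, which is the trade-off the abstract describes.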
Pages: 3412-3426
Page count: 15