Accommodating Time-Varying Heterogeneity in Risk Estimation under the Cox Model: A Transfer Learning Approach

Cited by: 4
Authors
Li, Ziyi [1]
Shen, Yu [1]
Ning, Jing [1]
Affiliations
[1] Univ Texas MD Anderson Canc Ctr, Dept Biostat, Houston, TX 77030 USA
Funding
National Institutes of Health (USA)
Keywords
Cox proportional hazards model; Inflammatory breast cancer; National Cancer Database; Risk assessment; Transfer learning; Lasso; Efficiency; Outcomes; Trends; Care
DOI
10.1080/01621459.2023.2210336
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
Transfer learning has attracted increasing attention in recent years for adaptively borrowing information across different data cohorts in various settings. Cancer registries have been widely used in clinical research because of their easy accessibility and large sample size. Our method is motivated by the question of how to use cancer registry data as a complement to improve the estimation precision of individual risks of death for inflammatory breast cancer (IBC) patients at The University of Texas MD Anderson Cancer Center. When transferring information for risk estimation based on the cancer registries (i.e., source cohort) to a single cancer center (i.e., target cohort), time-varying population heterogeneity needs to be appropriately acknowledged. However, there is no literature on how to adaptively transfer knowledge on risk estimation with time-to-event data from the source cohort to the target cohort while adjusting for time-varying differences in event risks between the two sources. Our goal is to address this statistical challenge by developing a transfer learning approach under the Cox proportional hazards model. To allow data-adaptive levels of information borrowing, we impose Lasso penalties on the discrepancies in regression coefficients and baseline hazard functions between the two cohorts, which are jointly solved in the proposed transfer learning algorithm. As shown in the extensive simulation studies, the proposed method yields more precise individualized risk estimation than using the target cohort alone. Meanwhile, our method demonstrates satisfactory robustness against cohort differences compared with the method that directly combines the target and source data in the Cox model. We develop a more accurate risk estimation model for the MD Anderson IBC cohort given various treatment and baseline covariates, while adaptively borrowing information from the National Cancer Database to improve risk assessment. Supplementary materials for this article are available online.
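The penalized estimation described in the abstract can be written out roughly as follows. This is a sketch inferred from the abstract alone, not the authors' exact formulation: the notation (superscripts S and T for the source and target cohorts, the discrepancy terms \eta and \xi, the tuning parameters \lambda_\eta and \lambda_\xi, and the discretization of the baseline hazard at the target event times t_1, ..., t_K) is assumed here for illustration.

```latex
% Cox model in each cohort c \in \{S, T\} (standard form):
\lambda^{(c)}(t \mid Z) = \lambda_0^{(c)}(t)\,\exp\!\big(\beta^{(c)\top} Z\big)

% Assumed parameterization: target = source estimate + discrepancy
\beta^{(T)} = \hat\beta^{(S)} + \eta, \qquad
h^{(T)}(t_k) = \hat h^{(S)}(t_k) + \xi_k, \quad k = 1, \dots, K

% Target-cohort log-likelihood with a discretized baseline hazard h,
% for observed times X_i, event indicators \Delta_i, covariates Z_i:
\ell^{(T)}(\beta, h) = \sum_{i=1}^{n_T} \Big[ \Delta_i \big\{ \log h(X_i) + \beta^\top Z_i \big\}
  - \exp\!\big(\beta^\top Z_i\big) \sum_{k:\, t_k \le X_i} h(t_k) \Big]

% Lasso penalties on both sets of discrepancies, solved jointly:
(\hat\eta, \hat\xi) = \arg\min_{\eta,\, \xi}
  \Big\{ -\ell^{(T)}\big(\hat\beta^{(S)} + \eta,\; \hat h^{(S)} + \xi\big)
  + \lambda_\eta \lVert \eta \rVert_1 + \lambda_\xi \lVert \xi \rVert_1 \Big\}
```

Under this reading, a discrepancy component shrunk exactly to zero means the target cohort fully borrows the corresponding source estimate, while a nonzero \xi_k lets the baseline hazards of the two cohorts differ at time t_k, which is one way the time-varying heterogeneity mentioned in the abstract could be accommodated.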
Pages: 2276-2287
Number of pages: 12