PRINCIPAL COMPONENTS ANALYSIS FOR RIGHT CENSORED DATA

Cited by: 1
Authors
Langworthy, Benjamin W. [1 ]
Cai, Jianwen [1 ]
Corty, Robert W. [2 ]
Kosorok, Michael R. [1 ]
Fine, Jason P. [2 ]
Affiliations
[1] Univ North Carolina Chapel Hill, Dept Biostat, Chapel Hill, NC 27599 USA
[2] Univ North Carolina Chapel Hill, Sch Med, Chapel Hill, NC 27599 USA
Keywords
Competing risks; multivariate survival analysis; principal components analysis; bivariate survival function
DOI
10.5705/ss.202021.0087
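Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract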
Principal components analysis (PCA) is a common dimension-reduction tool that transforms a set of variables into a linearly uncorrelated set of variables. Standard PCA estimators involve either the eigendecomposition of the estimated covariance matrix or a singular value decomposition of the centered data. However, for right-censored failure time data, estimating the principal components in this way is not straightforward because not all failure times are observed. Standard estimators for the covariance or correlation matrix should not be used in this case, because they require strong assumptions on the form of the joint distribution and on the marginal distributions beyond the final observation time. We present a novel, nonparametric estimator for the covariance of multivariate right-censored failure time data based on the counting processes and corresponding martingales defined by the failure times. We prove that these estimators are consistent and converge to a Gaussian process when properly standardized. We further show that these covariance estimates can be used to estimate a PCA for the martingales and counting processes for the different failure times. The corresponding estimates of the principal directions are consistent and asymptotically normal. We apply this method to data from a clinical trial of patients with pancreatic cancer, and recover a medically valid low-dimensional representation of adverse events.
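As a point of reference for the abstract's second sentence, the following minimal sketch shows the two standard PCA routes for fully observed (uncensored) data: eigendecomposition of the sample covariance matrix and singular value decomposition of the centered data. It is not the authors' censored-data estimator; the function names `pca_eig` and `pca_svd` and the simulated data are assumptions made purely for illustration.

```python
import numpy as np


def pca_eig(X):
    """PCA via eigendecomposition of the sample covariance matrix."""
    Xc = X - X.mean(axis=0)                 # center each column
    cov = Xc.T @ Xc / (X.shape[0] - 1)      # unbiased sample covariance
    vals, vecs = np.linalg.eigh(cov)        # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1]          # reorder to descending variance
    return vals[order], vecs[:, order]


def pca_svd(X):
    """PCA via singular value decomposition of the centered data."""
    Xc = X - X.mean(axis=0)
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    # Squared singular values, rescaled, equal the covariance eigenvalues.
    return s**2 / (X.shape[0] - 1), vt.T


# Hypothetical, fully observed data (no censoring), for illustration only.
rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.5]])
X = rng.normal(size=(200, 3)) @ mixing

vals_e, vecs_e = pca_eig(X)
vals_s, vecs_s = pca_svd(X)
```

Both routes should give the same component variances and the same principal directions up to sign. Both also require every entry of `X` to be observed, which is the step that fails under right censoring and that the paper's counting-process covariance estimator is designed to replace.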
Pages: 1985-2016
Page count: 32