Clustering with missing and left-censored data: A simulation study comparing multiple-imputation-based procedures

Cited by: 9
Authors
Faucheux, Lilith [1, 2]
Resche-Rigon, Matthieu [1, 3]
Curis, Emmanuel [3, 4]
Soumelis, Vassili [2, 5]
Chevret, Sylvie [1, 3]
Affiliations
[1] Univ Paris, Sorbonne Paris Cite, ECSTRRA Team, INSERM UMR1153, Paris, France
[2] Univ Paris, Sorbonne Paris Cite, INSERM U976, Paris, France
[3] Hop St Louis, AP HP, Serv Biostat & Informat Med, Paris, France
[4] Univ Paris, Sorbonne Paris Cite, Lab Biomath Plateau IB2 EA 7537 BioSTM, Fac Pharm, Paris, France
[5] Hop St Louis, AP HP, Lab Immunol Biol & Histocompatibil, Paris, France
Keywords
breast cancer; clustering; consensus; left-censored data; missing data; multiple imputation; LIMIT; QUANTIFICATION; INFERENCE; IMPACT;
DOI
10.1002/bimj.201900366
Chinese Library Classification
Q [Biological sciences]
Discipline classification codes
07; 0710; 09
Abstract
Cluster analysis, commonly used to explore large biomedical datasets, can be challenging, notably due to missing data or left-censored data induced by the sensitivity limits of the biochemical measurement method. Usually, complete-case analysis, simple imputation, or stochastic simple imputation are applied before clustering. More recently, consensus methods following multiple imputation have been proposed. However, they ignore left-censoring and do not allow the number of clusters to vary across the partitions of each imputed dataset. Here, we developed a consensus-based clustering algorithm in which left-censored data are taken into account using a modified multiple imputation method and the number of clusters is estimated for each imputed dataset. A simulation study was conducted to assess the performance in terms of the number of clusters, the percentage of unclassified observations, and the adjusted Rand index. The simulation results showed that the investigated method works well compared to several alternative approaches. A real-world application in breast cancer patients showed that the proposed method may reveal novel clusters of patients.
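The abstract names the adjusted Rand index as one of the performance measures. As a minimal sketch of that measure only (not the authors' code or their clustering procedure), the standard pair-counting formula can be written in plain Python. The function name `adjusted_rand_index` and the choice to return 1.0 for degenerate partitions are assumptions made for this illustration.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same observations.

    Illustrative helper (hypothetical name), based on the standard
    pair-counting formula; it is not taken from the paper.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same observations")

    n = len(labels_a)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        # Fewer than two observations: treated here as perfect agreement.
        return 1.0

    # Pairs placed together in both partitions (contingency table cells).
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together within each partition taken separately.
    sum_rows = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_rows * sum_cols / total_pairs  # agreement expected by chance
    max_index = 0.5 * (sum_rows + sum_cols)       # largest attainable agreement

    if max_index == expected:
        # Degenerate case (e.g. both partitions have a single cluster).
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Usage: cluster labels are arbitrary, so relabelling does not change the index.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
print(adjusted_rand_index([0, 0, 1, 1], [0, 0, 1, 2]))
```

A value of 1 indicates identical partitions up to relabelling, and values near 0 indicate agreement no better than chance, which is why the index suits comparing an estimated partition against a simulated ground truth.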
Pages: 372-393
Number of pages: 22