Prognosis of lasso-like penalized Cox models with tumor profiling improves prediction over clinical data alone and benefits from bi-dimensional pre-screening

Cited by: 11
Authors
Jardillier, Remy [1, 2]
Koca, Dzenis [1 ]
Chatelain, Florent [2 ]
Guyon, Laurent [1 ]
Affiliations
[1] Univ Grenoble Alpes, CEA, INSERM, Biosante U1292, IRIG, Grenoble, France
[2] Univ Grenoble Alpes, Inst Engn, GIPSA Lab, Grenoble INP, CNRS, Grenoble, France
Keywords
Cox model; Prediction; Survival model; Penalized regression; Lasso; RNA-seq; Cancer; GENERALIZED LINEAR-MODELS; EXPRESSION ANALYSIS; VARIABLE SELECTION; SURVIVAL ANALYSIS; CANCER PROGNOSIS; ELASTIC-NET; REGULARIZATION; LIKELIHOOD; REGRESSION; PACKAGE
DOI
10.1186/s12885-022-10117-1
Chinese Library Classification
R73 [Oncology]
Discipline classification code
100214
Abstract
Background: Prediction of patient survival from tumor molecular '-omics' data is a key step toward personalized medicine. Cox models fitted to RNA profiling datasets are popular for predicting clinical outcomes. These models are applied in a "high-dimensional" setting, however, because the number p of covariates (gene expression levels) greatly exceeds the number n of patients and the number e of events. Pre-screening and penalization methods are therefore widely used together for dimension reduction.

Methods: In the present paper, (i) we benchmark the performance of the lasso penalty and three variants (ridge, elastic net, and adaptive elastic net) on 16 cancers from TCGA after pre-screening, (ii) we propose a bi-dimensional pre-screening procedure, based on both gene variability and p-values from single-variable Cox models, to predict survival, and (iii) we compare our results with iterative sure independence screening (ISIS).

Results: First, we show that integrating mRNA-seq data with clinical data improves predictions over clinical data alone. Second, our bi-dimensional pre-screening procedure improves the C-index and/or the integrated Brier score only moderately, while excluding genes that are irrelevant for prediction. We show that the different penalization methods reach comparable prediction performance, with slight differences among datasets. Finally, we provide advice for the case of multi-omics data integration.

Conclusions: Tumor profiles convey more prognostic information than clinical variables such as stage for many cancer subtypes. Lasso and ridge penalties perform similarly to elastic net penalties for Cox models in high dimension. Pre-screening the top 200 genes in terms of single-variable Cox model p-values is a practical way to reduce dimension, which may be particularly useful when integrating multi-omics data.
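The bi-dimensional pre-screening idea described in the abstract (filter on gene variability, then rank by single-variable Cox p-values and keep the top genes) can be illustrated with a minimal sketch. This is not the authors' implementation. It rests on several assumptions: variability is measured by the population variance, the variability cutoff is a quantile chosen by the hypothetical parameter `var_quantile`, and the single-variable Cox p-value is approximated by a score test at beta = 0 (with Breslow-style handling of tied times) rather than by fitting each model. The function names `cox_score_pvalue` and `bidimensional_prescreen` are illustrative only; the default `top_k=200` follows the figure quoted in the abstract.

```python
import math
from statistics import pvariance


def cox_score_pvalue(x, time, event):
    """Score-test p-value for a single-covariate Cox model, evaluated at beta = 0.

    Assumption: this stands in for the p-value of a fitted univariate Cox model.
    """
    # Visit subjects from the latest to the earliest time so the risk set grows.
    order = sorted(range(len(x)), key=lambda i: time[i], reverse=True)
    n_risk, s1, s2 = 0, 0.0, 0.0   # risk-set size, sum of x, sum of x^2
    score, info = 0.0, 0.0         # score statistic U and its variance V
    i = 0
    while i < len(order):
        t = time[order[i]]
        j = i
        # Everyone sharing this time enters the risk set together (tied times).
        while j < len(order) and time[order[j]] == t:
            xi = x[order[j]]
            n_risk += 1
            s1 += xi
            s2 += xi * xi
            j += 1
        mean = s1 / n_risk
        var = max(0.0, s2 / n_risk - mean * mean)
        for k in range(i, j):
            if event[order[k]]:
                score += x[order[k]] - mean
                info += var
        i = j
    if info <= 0.0:
        return 1.0  # no information (e.g. constant covariate or no events)
    # Survival function of a chi-square with 1 degree of freedom.
    return math.erfc(math.sqrt(score * score / info / 2.0))


def bidimensional_prescreen(expr, time, event, var_quantile=0.5, top_k=200):
    """Keep genes above a variability quantile, then the top_k smallest p-values.

    expr maps a gene name to its list of expression values (one per patient).
    """
    variances = {g: pvariance(v) for g, v in expr.items()}
    ranked = sorted(variances.values())
    cutoff = ranked[int(var_quantile * (len(ranked) - 1))]
    kept = [g for g in expr if variances[g] >= cutoff]
    pvals = {g: cox_score_pvalue(expr[g], time, event) for g in kept}
    return sorted(kept, key=pvals.get)[:top_k]


# Toy data (made up for illustration): 8 patients, all with an observed event.
time = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
event = [1, 1, 1, 1, 1, 1, 1, 1]
expr = {
    "risk": [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0],   # high value, early event
    "noise": [1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0],  # unrelated to survival
    "flat": [3.0] * 8,                                   # no variability
}
selected = bidimensional_prescreen(expr, time, event, var_quantile=0.5, top_k=2)
print(selected)
```

On this toy input the constant gene is removed by the variability filter and the remaining genes are ordered by p-value; the selected genes would then be passed to a penalized Cox model such as the lasso.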
Pages: 16