A Toolbox for Functional Analysis and the Systematic Identification of Diagnostic and Prognostic Gene Expression Signatures Combining Meta-Analysis and Machine Learning

Cited by: 11
Authors
Vey, Johannes [1,2]
Kapsner, Lorenz A. [3]
Fuchs, Maximilian [1,4]
Unberath, Philipp [4]
Veronesi, Giulia [5]
Kunz, Meik [4]
Affiliations
[1] Univ Wurzburg, Dept Bioinformat, Funct Genom & Syst Biol Grp, D-97074 Wurzburg, Germany
[2] Heidelberg Univ, Inst Med Biometry & Informat, Neuenheimer Feld 130-3, D-69120 Heidelberg, Germany
[3] Erlangen Univ Hosp, Ctr Med Informat & Commun Technol, D-91054 Erlangen, Germany
[4] Friedrich Alexander Univ Erlangen Nurnberg, Chair Med Informat, D-91058 Erlangen, Germany
[5] Humanitas Res Hosp, Unit Thorac Surg, Via Manzoni 56, I-20089 Milan, Italy
Keywords
Bioinformatics tool; R package; machine learning; meta-analysis; biomarker signature; gene expression analysis; survival analysis; functional analysis; LUNG-CANCER; PACKAGE; REGULARIZATION; SELECTION; MODELS
DOI
10.3390/cancers11101606
Chinese Library Classification (CLC) number
R73 [Oncology]
Subject classification code
100214
Abstract
The identification of biomarker signatures is important for cancer diagnosis and prognosis. However, the detection of clinically reliable signatures is hampered by limited data availability, which may restrict statistical power. Moreover, few methods exist for integrating large sample cohorts and identifying signatures. We present a step-by-step computational protocol for functional gene expression analysis and the identification of diagnostic and prognostic signatures by combining meta-analysis with machine learning and survival analysis. The novelty of the toolbox lies in its all-in-one functionality, generic design, and modularity. It is exemplified for lung cancer, including a comprehensive evaluation using different validation strategies. However, the protocol is not restricted to specific disease types and can therefore be used by a broad community. The accompanying R package vignette runs in approximately 1 h and describes the workflow in detail for use by researchers with limited bioinformatics training.
Pages: 14