Integrative Analysis of Prognosis Data on Multiple Cancer Subtypes

Times cited: 16
Authors
Liu, Jin [1]
Huang, Jian [2]
Zhang, Yawei [3]
Lan, Qing [4]
Rothman, Nathaniel [4]
Zheng, Tongzhang [3]
Ma, Shuangge [3]
Affiliations
[1] UIC Sch Publ Hlth, Div Epidemiol & Biostat, Chicago, IL 60612 USA
[2] Univ Iowa, Dept Stat & Actuarial Sci, Iowa City, IA 52242 USA
[3] Yale Univ, Sch Publ Hlth, New Haven, CT USA
[4] NCI, Div Canc Epidemiol & Genet, NIH, Bethesda, MD 20892 USA
Funding
U.S. National Science Foundation;
Keywords
Cancer prognosis; Integrative analysis; Marker identification; Penalization; IDENTIFICATION; CONVERGENCE;
DOI
10.1111/biom.12177
Chinese Library Classification number
Q [Biological Sciences];
Subject classification codes
07; 0710; 09;
Abstract
In cancer research, profiling studies have been extensively conducted, searching for genes/SNPs associated with prognosis. Cancer is diverse. Examining the similarity and difference in the genetic basis of multiple subtypes of the same cancer can lead to a better understanding of their connections and distinctions. Classic meta-analysis methods analyze each subtype separately and then compare analysis results across subtypes. Integrative analysis methods, in contrast, analyze the raw data on multiple subtypes simultaneously and can outperform meta-analysis methods. In this study, prognosis data on multiple subtypes of the same cancer are analyzed. An AFT (accelerated failure time) model is adopted to describe survival. The genetic basis of multiple subtypes is described using the heterogeneity model, which allows a gene/SNP to be associated with prognosis of some subtypes but not others. A compound penalization method is developed to identify genes that contain important SNPs associated with prognosis. The proposed method has an intuitive formulation and is realized using an iterative algorithm. Asymptotic properties are rigorously established. Simulation shows that the proposed method has satisfactory performance and outperforms a penalization-based meta-analysis method and a regularized thresholding method. An NHL (non-Hodgkin lymphoma) prognosis study with SNP measurements is analyzed. Genes associated with the three major subtypes, namely DLBCL, FL, and CLL/SLL, are identified. The proposed method identifies genes that are different from alternatives and have important implications and satisfactory prediction performance.
Pages: 480-488
Number of pages: 9