Semi-supervised recursively partitioned mixture models for identifying cancer subtypes

Cited by: 52
Authors
Koestler, Devin C. [1]
Marsit, Carmen J. [2]
Christensen, Brock C. [2, 3]
Karagas, Margaret R. [4]
Bueno, Raphael [5]
Sugarbaker, David J. [5]
Kelsey, Karl T. [2, 3]
Houseman, E. Andres [3, 6]
Affiliations
[1] Brown Univ, Dept Community Hlth, Biostat Sect, Providence, RI 02912 USA
[2] Brown Univ, Dept Pathol & Lab Med, Providence, RI 02912 USA
[3] Brown Univ, Dept Community Hlth, Ctr Environm Hlth & Technol, Providence, RI 02912 USA
[4] Dartmouth Med Sch, Dept Community & Family Med, Lebanon, NH 03756 USA
[5] Harvard Univ, Sch Med, Brigham & Womens Hosp, Div Thorac Surg, Boston, MA 02115 USA
[6] Harvard Univ, Sch Publ Hlth, Dept Biostat, Boston, MA 02115 USA
Funding
National Institutes of Health (USA);
Keywords
ACUTE MYELOID-LEUKEMIA; HIGH-DIMENSIONAL DATA; DNA METHYLATION; BREAST-CANCER; PLEURAL MESOTHELIOMA; LUNG ADENOCARCINOMA; PATIENT SURVIVAL; CELL CARCINOMA; EXPRESSION; ALGORITHM;
DOI
10.1093/bioinformatics/btq470
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Motivation: Patients with identical cancer diagnoses often progress differently. The disparity in disease progression and treatment response can be attributed to the idea that two histologically similar cancers may be completely different diseases at the molecular level. Methods for identifying cancer subtypes associated with patient survival can be powerful instruments for understanding the biochemical processes that underlie disease progression, and they provide an initial step toward more personalized therapy for cancer patients. We propose a method called semi-supervised recursively partitioned mixture models (SS-RPMM) that uses array-based genetic data and patient-level clinical data to find cancer subtypes associated with patient survival.
Results: In the proposed SS-RPMM, cancer subtypes are identified using a selected subset of genes that are associated with survival time. Because survival information is used in the gene selection step, the method is semi-supervised. Unlike other semi-supervised clustering and classification methods, SS-RPMM does not require the number of cancer subtypes, which is often unknown, to be specified. In a simulation study, the proposed method compared favorably with competing semi-supervised methods, including semi-supervised clustering and supervised principal components analysis. Furthermore, an analysis of mesothelioma cancer data using SS-RPMM revealed at least two distinct methylation profiles that are informative for survival.
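The two ideas named in the abstract (survival-guided feature selection, then clustering without a preset number of subtypes) can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything below is an assumption rather than the authors' procedure: features are ranked by a univariate Cox score-test statistic, a hard 2-means split with diagonal Gaussian likelihoods stands in for the mixture model, a BIC comparison decides whether to keep each split, and all function names and parameters (`cox_scores`, `recursive_partition`, `min_size`, the choice of 5 features) are hypothetical.

```python
import numpy as np

def cox_scores(X, time, event):
    """Univariate Cox score-test z-statistic (at beta = 0) per column of X."""
    order = np.argsort(time)
    X, event = X[order], event[order].astype(bool)
    n = X.shape[0]
    at_risk = np.arange(n, 0, -1)[:, None]        # risk-set size at each time
    s1 = np.cumsum(X[::-1], axis=0)[::-1]         # risk-set sums of x
    s2 = np.cumsum(X[::-1] ** 2, axis=0)[::-1]    # risk-set sums of x^2
    mean = s1 / at_risk
    var = s2 / at_risk - mean ** 2
    U = (X[event] - mean[event]).sum(axis=0)      # score, summed over events
    V = var[event].sum(axis=0)                    # information
    return U / np.sqrt(V + 1e-12)

def gauss_loglik(X):
    """Approximate maximised log-likelihood of one diagonal Gaussian."""
    var = X.var(axis=0) + 1e-6                    # small variance floor
    return -0.5 * X.shape[0] * np.sum(np.log(2 * np.pi * var) + 1.0)

def two_means(X, n_iter=50):
    """Hard 2-means split, initialised along the first principal component."""
    Xc = X - X.mean(axis=0)
    pc1 = np.linalg.svd(Xc, full_matrices=False)[2][0]
    labels = Xc @ pc1 > 0
    for _ in range(n_iter):
        if labels.all() or not labels.any():
            break
        c0, c1 = X[~labels].mean(axis=0), X[labels].mean(axis=0)
        new = ((X - c1) ** 2).sum(axis=1) < ((X - c0) ** 2).sum(axis=1)
        if np.array_equal(new, labels):
            break
        labels = new
    return labels

def recursive_partition(X, idx=None, min_size=5):
    """Split recursively while a 2-class model has lower BIC than 1 class."""
    if idx is None:
        idx = np.arange(X.shape[0])
    sub = X[idx]
    n, p = sub.shape
    right = two_means(sub)
    n_r = int(right.sum())
    n_l = n - n_r
    if min(n_l, n_r) < min_size:
        return [idx]
    bic_one = -2 * gauss_loglik(sub) + 2 * p * np.log(n)
    ll_two = (gauss_loglik(sub[~right]) + gauss_loglik(sub[right])
              + n_l * np.log(n_l / n) + n_r * np.log(n_r / n))
    bic_two = -2 * ll_two + (4 * p + 1) * np.log(n)
    if bic_two >= bic_one:
        return [idx]                              # keep this node as one subtype
    return (recursive_partition(X, idx[~right], min_size)
            + recursive_partition(X, idx[right], min_size))

# Synthetic demonstration (made-up data, not from the paper).
rng = np.random.default_rng(0)
n, p = 200, 50
group = np.repeat([0, 1], n // 2)                 # hidden subtype
X = rng.normal(size=(n, p))
X[:, :5] += 4.0 * group[:, None]                  # 5 informative features
time = rng.exponential(np.where(group == 1, 1.0, 20.0))
event = np.ones(n, dtype=int)                     # no censoring in this toy example

z = cox_scores(X, time, event)
selected = np.argsort(-np.abs(z))[:5]             # semi-supervised selection step
clusters = recursive_partition(X[:, selected])    # number of subtypes not preset
print("selected features:", sorted(selected.tolist()))
print("number of clusters found:", len(clusters))
```

The number of clusters is an output of the recursion, not an input, which is the property the abstract emphasizes. The number of selected features is fixed by hand here; how the published method chooses it is not stated in the abstract.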
Pages: 2578-2585
Page count: 8