Improving Breast Cancer Survival Analysis through Competition-Based Multidimensional Modeling

Cited by: 59
Authors
Bilal, Erhan [1]
Dutkowski, Janusz [2, 3]
Guinney, Justin [4]
Jang, In Sock [4]
Logsdon, Benjamin A. [4, 5]
Pandey, Gaurav [6, 7]
Sauerwine, Benjamin A. [4]
Shimoni, Yishai [8, 9]
Vollan, Hans Kristian Moen [9, 10, 11, 12, 13, 14]
Mecham, Brigham H. [4]
Rueda, Oscar M. [12, 13]
Tost, Jorg [15]
Curtis, Christina [16]
Alvarez, Mariano J. [8, 9]
Kristensen, Vessela N. [10, 17]
Aparicio, Samuel [18, 19]
Borresen-Dale, Anne-Lise [10, 11]
Caldas, Carlos [11, 12, 13, 20, 21, 22]
Califano, Andrea [8, 9, 23, 24, 25, 26]
Friend, Stephen H. [4]
Ideker, Trey
Schadt, Eric E. [6]
Stolovitzky, Gustavo A. [1]
Margolin, Adam A. [4]
Affiliations
[1] IBM TJ Watson Res, Yorktown Hts, NY USA
[2] Univ Calif San Diego, Dept Med, La Jolla, CA 92093 USA
[3] Univ Calif San Diego, Dept Bioengn, La Jolla, CA 92093 USA
[4] Sage Bionetworks, Seattle, WA USA
[5] Fred Hutchinson Canc Res Ctr, Div Publ Hlth Sci, Seattle, WA 98104 USA
[6] Icahn Sch Med Mt Sinai, Dept Genet & Genom Sci, New York, NY USA
[7] Icahn Sch Med Mt Sinai, Icahn Inst Genom & Multiscale Biol, New York, NY USA
[8] Columbia Univ, Columbia Initiat Syst Biol, Columbia, NY USA
[9] Columbia Univ, Ctr Computat Biol & Bioinformat, New York, NY USA
[10] Oslo Univ Hosp, Inst Canc Res, Dept Genet, Oslo, Norway
[11] Univ Oslo, Fac Med, Inst Clin Med, KG Jebsen Ctr Breast Canc Res, Oslo, Norway
[12] Canc Res UK, Cambridge Res Inst, Cambridge, England
[13] Univ Cambridge, Dept Oncol, Cambridge, England
[14] Oslo Univ Hosp, Div Canc Med Surg & Transplantat, Dept Oncol, Oslo, Norway
[15] CEA, Inst Genom, Ctr Natl Genotypage, Lab Epigenet & Environm, Evry, France
[16] Univ So Calif, Keck Sch Med, Dept Prevent Med, Los Angeles, CA 90033 USA
[17] Akershus Univ Hosp, Div Med, Dept Clin Mol Biol, Ahus, Norway
[18] Univ British Columbia, Dept Pathol & Lab Med, Vancouver, BC, Canada
[19] British Columbia Canc Res Ctr, Vancouver, BC, Canada
[20] Cambridge Expt Canc Med Ctr, Cambridge, England
[21] Cambridge Univ Hosp NHS Fdn Trust, Cambridge Breast Unit, Cambridge, England
[22] Addenbrookes Hosp, NIHR Cambridge Biomed Res Ctr, Cambridge, England
[23] Columbia Univ, Dept Biomed Informat, New York, NY USA
[24] Columbia Univ, Dept Biochem & Mol Biophys, New York, NY USA
[25] Columbia Univ, Inst Canc Genet, New York, NY USA
[26] Columbia Univ, Herbert Irving Comprehens Canc Ctr, New York, NY USA
Keywords
GENE-EXPRESSION SIGNATURE; COMPUTATIONAL BIOLOGY; CLAUDIN-LOW; PROFILES; VERIFICATION; INSTABILITY; PREDICTION
DOI
10.1371/journal.pcbi.1003047
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Breast cancer is the most common malignancy in women and is responsible for hundreds of thousands of deaths annually. As with most cancers, it is a heterogeneous disease and different breast cancer subtypes are treated differently. Understanding the difference in prognosis for breast cancer based on its molecular and phenotypic features is one avenue for improving treatment by matching the proper treatment with molecular subtypes of the disease. In this work, we employed a competition-based approach to modeling breast cancer prognosis using large datasets containing genomic and clinical information and an online real-time leaderboard program used to speed feedback to the modeling team and to encourage each modeler to work towards achieving a higher ranked submission. We find that machine learning methods combined with molecular features selected based on expert prior knowledge can improve survival predictions compared to current best-in-class methodologies and that ensemble models trained across multiple user submissions systematically outperform individual models within the ensemble. We also find that model scores are highly consistent across multiple independent evaluations. This study serves as the pilot phase of a much larger competition open to the whole research community, with the goal of understanding general strategies for model optimization using clinical and molecular profiling data and providing an objective, transparent system for assessing prognostic models.
Pages: 16