Breast cancer prognosis by combinatorial analysis of gene expression data

Cited by: 208
Authors
Alexe G. [1, 2, 3]
Alexe S. [1]
Axelrod D.E. [4, 5]
Bonates T.O. [1]
Lozina I.I. [1]
Reiss M. [5, 6]
Hammer P.L. [1]
Affiliations
[1] RUTCOR (Rutgers University Center for Operations Research), Piscataway, NJ
[2] Computational Biology Center, TJ Watson IBM Research, Yorktown Heights, NY
[3] The Simons Center for Systems Biology, Institute for Advanced Study, Princeton, NJ
[4] Department of Genetics, Rutgers University, Piscataway, NJ
[5] The Cancer Institute of New Jersey, New Brunswick, NJ
[6] Division of Medical Oncology, UMDNJ-Robert Wood Johnson Medical School, New Brunswick, NJ
Keywords
Breast Cancer; Negative Case; Negative Pattern; Positive Pattern; Prognostic System
DOI
10.1186/bcr1512
Abstract
Introduction: The potential of applying data analysis tools to microarray data for diagnosis and prognosis is illustrated on the recent breast cancer dataset of van't Veer and coworkers. We re-examine that dataset using the novel technique of logical analysis of data (LAD), with a double objective: discovering patterns characteristic of cases with good or poor outcome and using them for accurate and justifiable predictions; and deriving novel information about the role of genes, the existence of special classes of cases, and other factors. Method: Data were analyzed using the combinatorics and optimization-based method of LAD, recently shown to provide highly accurate diagnostic and prognostic systems in cardiology, cancer proteomics, hematology, pulmonology, and other disciplines. Results: LAD identified a subset of 17 of the 25,000 genes that is capable of fully distinguishing between patients with poor prognoses and patients with good prognoses. An extensive list of 'patterns' or 'combinatorial biomarkers' (that is, combinations of genes and limitations on their expression levels) was generated, and 40 patterns were used to create a prognostic system, shown to have 100% and 92.9% weighted accuracy on the training and test sets, respectively. The prognostic system uses fewer genes than other methods, and its accuracy is similar to or better than that reported in other studies. Out of the 17 genes identified by LAD, three were shown to play a significant role in determining poor prognosis and five in determining good prognosis. Two new classes of patients (described by similar sets of covering patterns, gene expression ranges, and clinical features) were discovered. As a by-product of the study, it is shown that the training and the test sets of van't Veer have differing characteristics.
Conclusion: The study shows that LAD provides an accurate and fully explanatory prognostic system for breast cancer using genomic data (that is, a system that, in addition to predicting good or poor prognosis, provides an individualized explanation of the reasons for that prognosis for each patient). Moreover, the LAD model provides valuable insights into the roles of individual and combinatorial biomarkers, allows the discovery of new classes of patients, and generates a vast library of biomedical research hypotheses. © 2006 Alexe et al; licensee BioMed Central Ltd.
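The abstract describes patterns as combinations of genes with limits on their expression levels, and a prognostic system built from 40 such patterns. The following is a minimal Python sketch of that idea, not the authors' implementation. Everything concrete in it is an assumption made for illustration: the gene names, the bounds, the choice to treat "positive" as poor prognosis, the scoring rule (fraction of positive patterns covering a case minus fraction of negative patterns covering it), and the reading of "weighted accuracy" as the mean of sensitivity and specificity, which the abstract does not define.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass
class Pattern:
    """Hypothetical pattern: per-gene (low, high) bounds on expression level."""
    bounds: Dict[str, Tuple[float, float]]
    positive: bool  # assumption: True marks a poor-prognosis pattern

    def covers(self, sample: Dict[str, float]) -> bool:
        # A case is covered only if every listed gene falls inside its bounds.
        return all(lo <= sample[gene] <= hi
                   for gene, (lo, hi) in self.bounds.items())


def _covered_fraction(patterns: List[Pattern], sample: Dict[str, float]) -> float:
    if not patterns:
        return 0.0
    return sum(p.covers(sample) for p in patterns) / len(patterns)


def discriminant(patterns: List[Pattern], sample: Dict[str, float]) -> float:
    """Assumed scoring rule: share of positive patterns covering the case
    minus share of negative patterns covering it (range -1 to 1)."""
    pos = [p for p in patterns if p.positive]
    neg = [p for p in patterns if not p.positive]
    return _covered_fraction(pos, sample) - _covered_fraction(neg, sample)


def explain(patterns: List[Pattern], sample: Dict[str, float]) -> List[Pattern]:
    """The patterns covering a case act as its individualized explanation."""
    return [p for p in patterns if p.covers(sample)]


def weighted_accuracy(y_true: Sequence[bool], y_pred: Sequence[bool]) -> float:
    """Assumed definition: mean of sensitivity and specificity.
    Requires at least one positive and one negative case."""
    pos = [p for t, p in zip(y_true, y_pred) if t]
    neg = [p for t, p in zip(y_true, y_pred) if not t]
    sensitivity = sum(pos) / len(pos)
    specificity = sum(not p for p in neg) / len(neg)
    return (sensitivity + specificity) / 2


# Made-up genes and thresholds, for illustration only.
patterns = [
    Pattern({"geneA": (0.5, math.inf)}, positive=True),
    Pattern({"geneA": (-math.inf, 0.0), "geneB": (-math.inf, 0.2)}, positive=False),
]
high_risk = {"geneA": 0.8, "geneB": 0.1}
low_risk = {"geneA": -0.3, "geneB": 0.0}
```

Under these assumptions, a case with a positive score would be predicted to have a poor prognosis, and `explain` returns the patterns that justify the call for that specific patient.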
Related papers
50 in total
  • [41] Intrinsic Subtypes of Primary Breast Cancer - Gene Expression Analysis
    Schmidt, Marcus
    Thomssen, Christoph
    Untch, Michael
    ONCOLOGY RESEARCH AND TREATMENT, 2016, 39 (03) : 102 - 110
  • [42] Analysis of HOX Gene Expression Patterns in Human Breast Cancer
    Ho Hur
    Ji-Yeon Lee
    Hyo Jung Yun
    Byeong Woo Park
    Myoung Hee Kim
    Molecular Biotechnology, 2014, 56 : 64 - 71
  • [43] Systemic analysis of the expression levels and prognosis of breast cancer-related cadherins
    Xu, Mingfei
    Liu, Chaoyue
    Pu, Lulan
    Lai, Jinrong
    Li, Jingjia
    Ning, Qianwen
    Liu, Xin
    Deng, Shishan
    EXPERIMENTAL BIOLOGY AND MEDICINE, 2021, 246 (15) : 1706 - 1720
  • [44] Development of in vivo gene expression system in mice for breast cancer gene analysis
    Tagaya, Hiroaki
    Ishikawa, Kosuke
    Watanabe, Shinya
    Semba, Kentaro
    CANCER SCIENCE, 2018, 109 : 179 - 179
  • [45] Analysis of HOX Gene Expression Patterns in Human Breast Cancer
    Hur, Ho
    Lee, Ji-Yeon
    Yun, Hyo Jung
    Park, Byeong Woo
    Kim, Myoung Hee
    MOLECULAR BIOTECHNOLOGY, 2014, 56 (01) : 64 - 71
  • [46] Predictive Data Analytics for Breast Cancer Prognosis
    Chauhan, Ritu
    Kumar, Neeraj
    ADVANCED COMPUTING AND INTELLIGENT ENGINEERING, 2020, 1082 : 253 - 262
  • [47] Analysis of breast cancer related gene expression based on Cancer Browser database
    Song, Feiyang
    Guo, Ling
    Mao, Leer
    Lu, Waiting
    PROCEEDINGS OF THE 38TH CHINESE CONTROL CONFERENCE (CCC), 2019, : 6982 - 6986
  • [48] Comparative Analysis of Data Mining Algorithms for Cancer Gene Expression Data
    Thareja, Preeti
    Chhillar, Rajender Singh
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2021, 12 (10) : 322 - 328
  • [49] Role of Circ-ITCH Gene Polymorphisms and Its Expression in Breast Cancer Susceptibility and Prognosis
    Saadawy, Sara F.
    Raafat, Nermin
    Samy, Walaa M.
    Raafat, Ahmed
    Talaat, Aliaa
    DIAGNOSTICS, 2023, 13 (12)
  • [50] The effect of survivin gene in breast cancer risk and prognosis
    Mashadiyeva, Roya
    Cacina, Canan
    Arikan, Soykan
    Surmen, Saime
    Demirkol, Seyda
    Aksakal, Nihat
    Yaylim, Ilhan
    TURKISH JOURNAL OF BIOCHEMISTRY-TURK BIYOKIMYA DERGISI, 2023, 48 (02) : 168 - 174