Tumor classification and marker gene prediction by feature selection and fuzzy c-means clustering using microarray data

Cited by: 50
Authors
Wang, JB [1]
Bo, TH
Jonassen, I
Myklebost, O
Hovig, E
Affiliations
[1] Norwegian Radium Hosp, Dept Tumor Biol, N-0310 Oslo, Norway
[2] Univ Bergen, HIB, Dept Informat, N-5020 Bergen, Norway
[3] Univ Bergen, Bergen Ctr Computat Sci, Computat Biol Unit, N-5020 Bergen, Norway
[4] Univ Oslo, Dept Mol Biosci, N-0316 Oslo, Norway
DOI
10.1186/1471-2105-4-60
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Using DNA microarrays, we have developed two novel models for tumor classification and target gene prediction. First, gene expression profiles are summarized by optimally selected Self-Organizing Maps (SOMs), followed by tumor sample classification by Fuzzy C-means clustering. Then, the prediction of marker genes is accomplished by either manual feature selection (visualizing the weighted/mean SOM component plane) or automatic feature selection (by pair-wise Fisher's linear discriminant).

Results: The proposed models were tested on four published datasets: (1) Leukemia (2) Colon cancer (3) Brain tumors and (4) NCI cancer cell lines. The models gave class prediction with markedly reduced error rates compared to other class prediction approaches, and the importance of feature selection on microarray data analysis was also emphasized.

Conclusions: Our models identify marker genes with predictive potential, often better than other available methods in the literature. The models are potentially useful for medical diagnostics and may reveal some insights into cancer classification. Additionally, we illustrated two limitations in tumor classification from microarray data related to the biology underlying the data, in terms of (1) the class size of data, and (2) the internal structure of classes. These limitations are not specific for the classification models used.
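
Two of the steps named in the abstract, fuzzy c-means clustering of samples and a Fisher-style ranking of genes for one pair of classes, can be sketched in a few lines of NumPy. This is only an illustrative sketch on synthetic data, not the authors' implementation: the SOM summarization step is omitted, the per-gene score used here is the common univariate Fisher criterion (which may differ from the paper's pair-wise linear discriminant procedure), and the names `fuzzy_c_means`, `fisher_scores`, the fuzzifier `m = 2`, and the toy matrix `X` are all assumptions made for the example.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook fuzzy c-means on the rows of X (samples x features).

    Returns (centers, U), where U[i, k] is the membership of sample i in
    cluster k and every row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalised so each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)  # guard against division by zero
        # Membership update: u_ik is proportional to d_ik^(-2 / (m - 1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def fisher_scores(X, labels, a, b):
    """Per-feature Fisher criterion for classes a and b (assumed scoring rule):
    squared difference of class means over the sum of class variances."""
    Xa, Xb = X[labels == a], X[labels == b]
    num = (Xa.mean(axis=0) - Xb.mean(axis=0)) ** 2
    den = Xa.var(axis=0) + Xb.var(axis=0) + 1e-12
    return num / den


# Synthetic stand-in for an expression matrix: 40 samples x 5 "genes".
# Only gene 0 differs between the two groups of 20 samples.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 5))
X[20:, 0] += 8.0

centers, U = fuzzy_c_means(X, n_clusters=2)
labels = U.argmax(axis=1)          # harden memberships into class calls
scores = fisher_scores(X, labels, 0, 1)
top_gene = int(scores.argmax())    # index of the highest-scoring gene
print("hard labels:", labels)
print("top-ranked gene index:", top_gene)
```

The soft membership matrix `U` is what distinguishes fuzzy c-means from hard k-means: samples lying between clusters receive intermediate memberships, which can be inspected before hardening them with `argmax`.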
Pages: 12
Related papers
50 in total
  • [1] Tumor classification and marker gene prediction by feature selection and fuzzy c-means clustering using microarray data
    Junbai Wang
    Trond Hellem Bø
    Inge Jonassen
    Ola Myklebost
    Eivind Hovig
    BMC Bioinformatics, 4
  • [2] CLUSTERING MICROARRAY GENE EXPRESSION DATA USING FUZZY C-MEANS AND DTW DISTANCE
    Taghizad, H.
    Mehridehnavi, A.
    2011 3RD INTERNATIONAL CONFERENCE ON COMPUTER TECHNOLOGY AND DEVELOPMENT (ICCTD 2011), VOL 1, 2012, : 395 - 399
  • [3] Cancer Classification using Fuzzy C-Means with Feature Selection
    Rachman, Arvan Aulia
    Rustam, Zuherman
    2016 12TH INTERNATIONAL CONFERENCE ON MATHEMATICS, STATISTICS, AND THEIR APPLICATIONS (ICMSA), 2016, : 31 - 34
  • [4] Fuzzy C-means method for clustering microarray data
    Dembélé, D
    Kastner, P
    BIOINFORMATICS, 2003, 19 (08) : 973 - 980
  • [5] Feature clustering and feature discretization assisting gene selection for molecular classification using fuzzy c-means and expectation–maximization algorithm
    Hung-Yi Lin
    The Journal of Supercomputing, 2021, 77 : 5381 - 5397
  • [6] Assessment of reliability of microarray data using Fuzzy c-Means classification
    Alci, M
    Asyali, MH
    NEURAL INFORMATION PROCESSING, 2004, 3316 : 1322 - 1327
  • [7] Combining Fuzzy C-Means Clustering with Fuzzy Rough Feature Selection
    Zhao, Ruonan
    Gu, Lize
    Zhu, Xiaoning
    APPLIED SCIENCES-BASEL, 2019, 9 (04):
  • [8] Feature clustering and feature discretization assisting gene selection for molecular classification using fuzzy c-means and expectation-maximization algorithm
    Lin, Hung-Yi
    JOURNAL OF SUPERCOMPUTING, 2021, 77 (06): : 5381 - 5397
  • [9] The modified fuzzy c-means method for clustering of microarray data
    Taraskina, A. S.
    Cheremushkin, E. S.
    PROCEEDINGS OF THE FIFTH INTERNATIONAL CONFERENCE ON BIOINFORMATICS OF GENOME REGULATION AND STRUCTURE, VOL 1, 2006, : 180 - +
  • [10] Fuzzy C-Means Text Clustering with Supervised Feature Selection
    Wang, Wei
    Wang, Chunheng
    Cui, Xia
    Wang, Ai
    FIFTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, VOL 1, PROCEEDINGS, 2008, : 57 - 61