Tumor classification and marker gene prediction by feature selection and fuzzy c-means clustering using microarray data

Cited by: 50
Authors
Wang, JB [1]
Bo, TH
Jonassen, I
Myklebost, O
Hovig, E
Affiliations
[1] Norwegian Radium Hosp, Dept Tumor Biol, N-0310 Oslo, Norway
[2] Univ Bergen, HIB, Dept Informat, N-5020 Bergen, Norway
[3] Univ Bergen, Bergen Ctr Computat Sci, Computat Biol Unit, N-5020 Bergen, Norway
[4] Univ Oslo, Dept Mol Biosci, N-0316 Oslo, Norway
DOI
10.1186/1471-2105-4-60
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Using DNA microarrays, we have developed two novel models for tumor classification and target gene prediction. First, gene expression profiles are summarized by optimally selected Self-Organizing Maps (SOMs), followed by tumor sample classification by Fuzzy C-means clustering. Then, the prediction of marker genes is accomplished by either manual feature selection (visualizing the weighted/mean SOM component plane) or automatic feature selection (by pair-wise Fisher's linear discriminant).

Results: The proposed models were tested on four published datasets: (1) leukemia, (2) colon cancer, (3) brain tumors, and (4) NCI cancer cell lines. The models gave class prediction with markedly reduced error rates compared to other class prediction approaches, and the importance of feature selection in microarray data analysis was also emphasized.

Conclusions: Our models identify marker genes with predictive potential, often better than other available methods in the literature. The models are potentially useful for medical diagnostics and may reveal some insights into cancer classification. Additionally, we illustrated two limitations in tumor classification from microarray data related to the biology underlying the data, in terms of (1) the class size of the data and (2) the internal structure of classes. These limitations are not specific to the classification models used.
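
The fuzzy c-means step named in the abstract can be illustrated with a minimal sketch. This is a generic textbook version of the algorithm, not the authors' implementation: the function name `fuzzy_c_means`, its parameters (fuzzifier `m`, `n_iter`, `tol`, `seed`), and the toy `samples` are all assumptions made for illustration, and the SOM summarization and Fisher-discriminant feature selection are not reproduced here.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Generic fuzzy c-means on a list of equal-length float vectors.

    Returns (centers, memberships). memberships[i][k] is the degree to
    which point i belongs to cluster k; each row sums to 1.
    All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, normalized so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    exponent = 1.0 / (m - 1.0)
    for _ in range(n_iter):
        # Centers are means of the points weighted by membership ** m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )

        # Memberships are proportional to (1 / squared distance) ** (1 / (m - 1)).
        max_change = 0.0
        for i in range(n):
            d2 = [sum((points[i][d] - c[d]) ** 2 for d in range(dim))
                  for c in centers]
            if min(d2) == 0.0:
                # Point sits exactly on one or more centers: split membership among them.
                zeros = [k for k, v in enumerate(d2) if v == 0.0]
                new_row = [1.0 / len(zeros) if k in zeros else 0.0
                           for k in range(n_clusters)]
            else:
                inv = [(1.0 / v) ** exponent for v in d2]
                total = sum(inv)
                new_row = [v / total for v in inv]
            max_change = max(
                max_change,
                max(abs(a - b) for a, b in zip(new_row, u[i])),
            )
            u[i] = new_row

        if max_change < tol:
            break
    return centers, u


# Toy stand-in for expression profiles: two well-separated groups of samples.
samples = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
           [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
centers, memberships = fuzzy_c_means(samples, n_clusters=2)
# Hard labels are obtained by taking the cluster with the highest membership.
labels = [max(range(2), key=row.__getitem__) for row in memberships]
```

Unlike hard k-means, each sample keeps a graded membership in every cluster, so a sample lying between two tumor classes shows up as a row with no dominant membership value rather than being forced into one class.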
Pages: 12
Related papers
50 items in total
  • [21] Application of fuzzy ARTMAP and fuzzy c-means clustering to pattern classification with incomplete data
    Chee Peng Lim
    Mei Ming Kuan
    Robert F. Harrison
    Neural Computing & Applications, 2005, 14 : 104 - 113
  • [23] A Fuzzy C-Means Clustering-Based Hybrid Multivariate Time Series Prediction Framework With Feature Selection
    Zhan, Jianming
    Huang, Xianfeng
    Qian, Yuhua
    Ding, Weiping
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2024, 32 (08) : 4270 - 4284
  • [24] Microarray Time-Series Data Clustering Using Rough-Fuzzy C-Means Algorithm
    Maji, Pradipta
    Paul, Sushmita
    2011 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM 2011), 2011 : 269 - 272
  • [25] Application of fuzzy ARTMAP and fuzzy c-means clustering to pattern classification with incomplete data
    Lim, C
    Kuan, M
    Harrison, R
    NEURAL COMPUTING & APPLICATIONS, 2005, 14 (02) : 104 - 113
  • [26] Butterfly Optimized Feature Selection with Fuzzy C-Means Classifier for Thyroid Prediction
    Kumar, S. J. K. Jagadeesh
    Parthasarathi, P.
    Masud, Mehedi
    Al-Amri, Jehad F.
    Abouhawwash, Mohamed
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2023, 35 (03) : 2909 - 2924
  • [27] Audio signal segmentation and classification using fuzzy c-means clustering
    Nitanda, Naoki
    Haseyama, Miki
    Kitajima, Hideo
    Systems and Computers in Japan, 2006, 37 (04) : 23 - 34
  • [28] Classification of soothing music using Fuzzy C-Means clustering algorithm
    Hsu, Ya-Wen
    Tsai, Hong-Pin
    Chiu, Ming-Chuan
    Hwang, Sheue-Ling
    Shih, Hsiang-Lan
    Huang, Fang-Ting
    Lee, Chun-Ting
    BRIDGING RESEARCH AND GOOD PRACTICES TOWARDS PATIENT WELFARE: HEALTHCARE SYSTEMS ERGONOMICS AND PATIENT SAFETY 2014, 2015 : 337 - 345
  • [30] Classification of Parkinson's disease using feature weighting method on the basis of fuzzy C-means clustering
    Polat, Kemal
    INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 2012, 43 (04) : 597 - 609