Fuzzy-Rough Supervised Attribute Clustering Algorithm and Classification of Microarray Data

Cited by: 50
Author
Maji, Pradipta [1]
Affiliation
[1] Indian Stat Inst, Machine Intelligence Unit, Kolkata 700108, India
Source
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS | 2011 / Vol. 41 / No. 1
Keywords
Attribute clustering; classification; gene selection; microarray analysis; rough sets; GENE-EXPRESSION; SELECTION;
DOI
10.1109/TSMCB.2010.2050684
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
One of the major tasks with gene expression data is to find groups of coregulated genes whose collective expression is strongly associated with sample categories. To this end, a new clustering algorithm, termed fuzzy-rough supervised attribute clustering (FRSAC), is proposed. The algorithm is based on the theory of fuzzy-rough sets and directly incorporates the information of sample categories into the gene clustering process. A new quantitative measure based on fuzzy-rough sets is introduced that uses the sample categories to measure the similarity among genes. Genes are grouped according to this measure, whereby redundancy among the genes is removed, and the clusters are refined incrementally based on sample categories. The effectiveness of the FRSAC algorithm, along with a comparison with existing supervised and unsupervised gene selection and clustering algorithms, is demonstrated on six cancer and two arthritis data sets using the class separability index and the predictive accuracy of the naive Bayes classifier, the K-nearest neighbor rule, and the support vector machine.
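The abstract does not spell out the new similarity measure, so the following is only a minimal sketch of the general idea of scoring a gene against sample categories with fuzzy-rough sets. It assumes a commonly used fuzzy-rough dependency degree (fuzzy similarity relation plus lower approximation in the sense of Dubois and Prade), not the measure defined in the paper. The function name `fuzzy_rough_dependency` and the toy expression values are hypothetical.

```python
import numpy as np


def fuzzy_rough_dependency(values, labels):
    """Fuzzy-rough dependency of the class labels on one attribute (gene).

    Assumption: a generic dependency degree, not the FRSAC measure itself.
    """
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    spread = values.max() - values.min()
    if spread == 0:
        return 0.0  # assumption: a constant attribute is treated as uninformative
    # Fuzzy similarity relation: 1 for identical values, 0 at the full range.
    similarity = 1.0 - np.abs(values[:, None] - values[None, :]) / spread
    # Lower-approximation membership of each sample in its own class:
    # the minimum of (1 - similarity) over all samples of a different class.
    different = labels[:, None] != labels[None, :]
    dissimilarity = np.where(different, 1.0 - similarity, 1.0)
    lower = dissimilarity.min(axis=1)
    # Dependency degree: average membership in the fuzzy positive region.
    return float(lower.mean())


# Toy data (hypothetical): four samples, two categories.
labels = [0, 0, 1, 1]
informative_gene = [0.0, 0.1, 0.9, 1.0]  # expression follows the categories
noisy_gene = [0.0, 1.0, 0.1, 0.9]        # expression ignores the categories
score_informative = fuzzy_rough_dependency(informative_gene, labels)
score_noisy = fuzzy_rough_dependency(noisy_gene, labels)
```

On this toy data the gene whose expression tracks the categories scores higher than the one that does not, which is the kind of class-aware relevance signal a supervised attribute clustering method can use when grouping genes.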
Pages: 222-233
Page count: 12