Empirical Evidence of the Applicability of Functional Clustering through Gene Expression Classification

Cited by: 8
Authors
Krejnik, Milos [1]
Klema, Jiri [1]
Affiliations
[1] Czech Tech Univ, Dept Cybernet, Fac Elect Engn, Prague 16627 6, Czech Republic
Keywords
Biological prior knowledge; gene expression; gene set analysis; clustering; feature extraction; classification; MICROARRAY DATA; CANCER; TOOLS; PREDICTION; EPITHELIUM; SELECTION; QUALITY;
DOI
10.1109/TCBB.2012.23
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
The availability of a great range of prior biological knowledge about the roles and functions of genes and gene-gene interactions allows us to simplify the analysis of gene expression data to make it more robust, compact, and interpretable. Here, we objectively analyze the applicability of functional clustering for the identification of groups of functionally related genes. The analysis is performed in terms of gene expression classification and uses predictive accuracy as an unbiased performance measure. Features of biological samples that originally corresponded to genes are replaced by features that correspond to the centroids of the gene clusters and are then used for classifier learning. Using 10 benchmark data sets, we demonstrate that functional clustering significantly outperforms random clustering without biological relevance. We also show that functional clustering performs comparably to gene expression clustering, which groups genes according to the similarity of their expression profiles. Finally, the suitability of functional clustering as a feature extraction technique is evaluated and discussed.
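The feature replacement described in the abstract can be illustrated with a minimal sketch. It assumes that a cluster centroid is the plain arithmetic mean of the expression values of the genes in that cluster, computed per sample; the abstract does not state which centroid definition the authors use. The function name `centroid_features`, the toy matrix, and the cluster assignment are hypothetical and are not taken from the paper.

```python
def centroid_features(expression, gene_clusters):
    """Replace per-gene features with per-cluster centroid features.

    expression: list of samples, each a list of gene expression values.
    gene_clusters: list of clusters, each a list of gene column indices
        (hypothetical input, e.g. groups of functionally related genes).
    Returns one row per sample with one value per cluster.
    Assumption: the centroid is the arithmetic mean over the cluster's genes.
    """
    return [
        [sum(sample[g] for g in cluster) / len(cluster) for cluster in gene_clusters]
        for sample in expression
    ]


# Toy data: 3 samples described by 4 genes.
expression = [
    [1.0, 3.0, 10.0, 20.0],
    [2.0, 4.0, 30.0, 50.0],
    [0.0, 0.0, 5.0, 5.0],
]
# Hypothetical grouping: genes 0-1 form one cluster, genes 2-3 another.
gene_clusters = [[0, 1], [2, 3]]

features = centroid_features(expression, gene_clusters)
print(features)
```

The resulting matrix has one column per cluster rather than one per gene, and it is this reduced representation that would be passed to a classifier.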
Pages: 788-798
Number of pages: 11
Related papers
50 in total
  • [1] Ensemble classification for gene expression data based on parallel clustering
    Meng, Jun
    Jiang, Dingling
    Zhang, Jing
    Luan, Yushi
    INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, 2018, 20 (03) : 213 - 229
  • [2] Impact of missing data imputation methods on gene expression clustering and classification
    de Souto, Marcilio C. P.
    Jaskowiak, Pablo A.
    Costa, Ivan G.
    BMC BIOINFORMATICS, 2015, 16
  • [3] Impact of missing data imputation methods on gene expression clustering and classification
    Marcilio CP de Souto
    Pablo A Jaskowiak
    Ivan G Costa
    BMC Bioinformatics, 16
  • [4] Clustering of high throughput gene expression data
    Pirim, Harun
    Eksioglu, Burak
    Perkins, Andy D.
    Yuceer, Cetin
    COMPUTERS & OPERATIONS RESEARCH, 2012, 39 (12) : 3046 - 3061
  • [5] Gene encoder: a feature selection technique through unsupervised deep learning-based clustering for large gene expression data
    Uzma
    Al-Obeidat, Feras
    Tubaishat, Abdallah
    Shah, Babar
    Halim, Zahid
    NEURAL COMPUTING & APPLICATIONS, 2022, 34 (11) : 8309 - 8331
  • [6] Functional embedding for the classification of gene expression profiles
    Wu, Ping-Shi
    Mueller, Hans-Georg
    BIOINFORMATICS, 2010, 26 (04) : 509 - 517
  • [7] Gene expression based cancer classification
    Tarek, Sara
    Abd Elwahab, Reda
    Shoman, Mahmoud
    EGYPTIAN INFORMATICS JOURNAL, 2017, 18 (03) : 151 - 159
  • [8] Fuzzy clustering-based discretization for gene expression classification
    Keivan Kianmehr
    Mohammed Alshalalfa
    Reda Alhajj
    Knowledge and Information Systems, 2010, 24 : 441 - 465
  • [9] Fuzzy clustering-based discretization for gene expression classification
    Kianmehr, Keivan
    Alshalalfa, Mohammed
    Alhajj, Reda
    KNOWLEDGE AND INFORMATION SYSTEMS, 2010, 24 (03) : 441 - 465
  • [10] Proximity Measures for Clustering Gene Expression Microarray Data: A Validation Methodology and a Comparative Analysis
    Jaskowiak, Pablo A.
    Campello, Ricardo J. G. B.
    Costa, Ivan G.
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2013, 10 (04) : 845 - 857