Evaluation of clustering algorithms for gene expression data using gene ontology annotations

Cited by: 3
Authors
Ma Ning [1]
Zhang Zheng-guo [1]
Affiliations
[1] Chinese Acad Med Sci, Peking Union Med Coll, Inst Basic Med Sci, Dept Biomed Engn, Sch Basic Med, Beijing 100005, Peoples R China
Keywords
microarray; gene expression; clustering; gene ontology; TOOL;
DOI
10.3760/cma.j.issn.0366-6999.2012.17.015
Chinese Library Classification number
R5 [Internal medicine];
Discipline classification code
1002; 100201;
Abstract
Background: Clustering is a useful exploratory technique for interpreting gene expression data to reveal groups of genes sharing common functional attributes. Biologists frequently face the problem of choosing an appropriate algorithm. We aimed to provide a standalone, easily accessible, and biologically oriented criterion for evaluating the clustering of expression data.
Methods: An external criterion that uses annotation-based similarities between genes is proposed in this work. Gene ontology information is employed as the annotation source. Six widely used clustering algorithms were compared over various types of gene expression data sets using the proposed criterion.
Results: The ranking of these algorithms given by the criterion agrees with common knowledge. Single-linkage performs significantly worse than the others, even worse than the random algorithm. Ward's method achieves the best performance in most cases.
Conclusions: The proposed criterion has a strong ability to distinguish among different clustering algorithms with different distance measurements. It is also demonstrated that analyzing the main contributors to the criterion may offer some guidance in finding locally compact clusters. In addition, we suggest using Ward's algorithm for gene expression data analysis.
Chin Med J 2012;125(17):3048-3052
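The abstract does not state the formula of the proposed criterion, so the following is only a minimal sketch of the general idea of an annotation-based external criterion, not the authors' method. It assumes Jaccard similarity between sets of gene ontology terms and scores a clustering by the mean similarity over all within-cluster gene pairs. The function names `jaccard` and `annotation_score`, the gene names, and the toy annotation sets are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of annotation terms."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def annotation_score(clusters, annotations):
    """Mean pairwise annotation similarity over all within-cluster gene pairs.

    Assumed scoring rule for illustration; higher means genes grouped
    together tend to share more annotation terms.
    """
    sims = [
        jaccard(annotations[g1], annotations[g2])
        for genes in clusters
        for g1, g2 in combinations(sorted(genes), 2)
    ]
    return sum(sims) / len(sims) if sims else 0.0


# Hypothetical toy annotations (gene -> set of GO term IDs); not data from the paper.
annotations = {
    "geneA": {"GO:0006412", "GO:0003735"},
    "geneB": {"GO:0006412", "GO:0003735"},
    "geneC": {"GO:0006260"},
    "geneD": {"GO:0006260", "GO:0003677"},
}

# Two candidate partitions of the same four genes.
coherent = [{"geneA", "geneB"}, {"geneC", "geneD"}]
mixed = [{"geneA", "geneC"}, {"geneB", "geneD"}]

print(annotation_score(coherent, annotations))
print(annotation_score(mixed, annotations))
```

Under these assumptions, the partition that groups genes with shared annotations scores higher than the one that mixes them, which is the kind of ordering an external criterion would use to rank the output of different clustering algorithms.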
Pages: 3048-3052
Page count: 5